2025-12-04T09:15:48.9563241Z Current runner version: '2.330.0' 2025-12-04T09:15:48.9569357Z Runner name: 'i-04e943c8c4e7268ee' 2025-12-04T09:15:48.9570189Z Runner group name: 'default' 2025-12-04T09:15:48.9571113Z Machine name: 'ip-10-0-73-227' 2025-12-04T09:15:48.9574030Z ##[group]GITHUB_TOKEN Permissions 2025-12-04T09:15:48.9576400Z Contents: read 2025-12-04T09:15:48.9577000Z Metadata: read 2025-12-04T09:15:48.9577579Z ##[endgroup] 2025-12-04T09:15:48.9579515Z Secret source: Actions 2025-12-04T09:15:48.9580241Z Prepare workflow directory 2025-12-04T09:15:49.0078566Z Prepare all required actions 2025-12-04T09:15:49.0115405Z Getting action download info 2025-12-04T09:15:49.3899545Z Download action repository 'pytorch/test-infra@main' (SHA:39aa74d619174326f4e2fb0e216151c2f29d9ffd) 2025-12-04T09:15:51.8441558Z Download action repository 'pytorch/pytorch@main' (SHA:7716da9fb23f27a65b41f9f016a2afadf281c18f) 2025-12-04T09:16:07.9983525Z Download action repository 'actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065' (SHA:a26af69be951a213d495a4c3e4e4022e16d87065) 2025-12-04T09:16:08.3895131Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722) 2025-12-04T09:16:08.6512540Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-12-04T09:16:08.8674691Z Download action repository 'seemethere/download-artifact-s3@1da556a7aa0a088e3153970611f6c432d58e80e6' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-12-04T09:16:09.1337299Z Download action repository 'seemethere/upload-artifact-s3@baba72d0712b404f646cebe0730933554ebce96a' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-12-04T09:16:09.4805712Z Getting action download info 2025-12-04T09:16:09.6354046Z Download action repository 'actions/checkout@v4' (SHA:34e114876b0b11c390a56381ad16ebd13914f8d5) 2025-12-04T09:16:10.0044286Z Getting action download info 2025-12-04T09:16:10.1390224Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e) 2025-12-04T09:16:10.3927632Z Getting action download info 2025-12-04T09:16:10.5266391Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2025-12-04T09:16:10.7468970Z Getting action download info 2025-12-04T09:16:10.8956005Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/heads/main (ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32) 2025-12-04T09:16:10.8960026Z ##[group] Inputs 2025-12-04T09:16:10.8960436Z build-environment: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck 2025-12-04T09:16:10.8969115Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}]} 2025-12-04T09:16:10.8978234Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:16:10.8979045Z sync-tag: 2025-12-04T09:16:10.8979776Z timeout-minutes: 300 2025-12-04T09:16:10.8980033Z use-gha: 2025-12-04T09:16:10.8980249Z dashboard-tag: 2025-12-04T09:16:10.8980490Z s3-bucket: gha-artifacts 2025-12-04T09:16:10.8980757Z aws-role-to-assume: 2025-12-04T09:16:10.8981265Z disable-monitor: false 2025-12-04T09:16:10.8981560Z monitor-log-interval: 5 2025-12-04T09:16:10.8981868Z monitor-data-collect-interval: 1 2025-12-04T09:16:10.8982172Z ##[endgroup] 2025-12-04T09:16:10.8982918Z Complete job name: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:16:10.9672578Z A job started hook has been configured by the self-hosted runner administrator 2025-12-04T09:16:10.9771786Z ##[group]Run '/home/ec2-user/runner-scripts/before_job.sh' 2025-12-04T09:16:10.9783153Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:16:10.9783748Z ##[endgroup] 2025-12-04T09:16:12.4665914Z Runner Type: linux.g5.4xlarge.nvidia.gpu 2025-12-04T09:16:12.4666370Z Instance Type: g5.4xlarge 2025-12-04T09:16:12.4666637Z AMI Name: unknown 2025-12-04T09:16:12.4718877Z AMI ID: ami-08982f1c5bf93d976 2025-12-04T09:16:17.9617773Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2025-12-04T09:16:17.9618210Z with: 2025-12-04T09:16:17.9618733Z github-secret: *** 2025-12-04T09:16:17.9619431Z instructions: All testing is done inside the container, to start an interactive session run: docker exec -it $(docker container ps --format '{{.ID}}') bash 2025-12-04T09:16:17.9620182Z activate-with-label: false 2025-12-04T09:16:17.9620506Z label: with-ssh 2025-12-04T09:16:17.9620767Z remove-existing-keys: true 2025-12-04T09:16:17.9621040Z fail-silently: true 2025-12-04T09:16:17.9621317Z env: 2025-12-04T09:16:17.9621543Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:16:17.9621806Z ##[endgroup] 2025-12-04T09:16:18.1118779Z Please see https://github.com/pytorch/pytorch/wiki/Debugging-using-with-ssh-for-Github-Actions for more info. 2025-12-04T09:16:18.1120982Z Not on pull request and ciflow reference could not be extracted, skipping adding ssh keys 2025-12-04T09:16:18.1307683Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main 2025-12-04T09:16:18.1308105Z with: 2025-12-04T09:16:18.1308327Z no-sudo: true 2025-12-04T09:16:18.1308569Z submodules: recursive 2025-12-04T09:16:18.1308834Z fetch-depth: 0 2025-12-04T09:16:18.1309273Z env: 2025-12-04T09:16:18.1309492Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:16:18.1309772Z ##[endgroup] 2025-12-04T09:16:18.1384624Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-12-04T09:16:18.1385561Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-12-04T09:16:18.1400598Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:16:18.1400972Z env: 2025-12-04T09:16:18.1401226Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:16:18.1401529Z ##[endgroup] 2025-12-04T09:16:18.1493488Z ##[group]Run # Use all available CPUs for fetching 2025-12-04T09:16:18.1493934Z # Use all available CPUs for fetching 2025-12-04T09:16:18.1494295Z cd "${GITHUB_WORKSPACE}" 2025-12-04T09:16:18.1494639Z git config --global fetch.parallel 0 2025-12-04T09:16:18.1495037Z git config --global submodule.fetchJobs 0 2025-12-04T09:16:18.1495446Z  2025-12-04T09:16:18.1495966Z # Clean workspace. The default checkout action should also do this, but 2025-12-04T09:16:18.1496478Z # do it here as well just in case 2025-12-04T09:16:18.1496812Z if [[ -d .git ]]; then 2025-12-04T09:16:18.1497120Z  if [ -z "${NO_SUDO}" ]; then 2025-12-04T09:16:18.1525710Z  sudo git clean -ffdx 2025-12-04T09:16:18.1526076Z  else 2025-12-04T09:16:18.1526351Z  git clean -ffdx 2025-12-04T09:16:18.1526648Z  fi 2025-12-04T09:16:18.1526858Z fi 2025-12-04T09:16:18.1536387Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:16:18.1536785Z env: 2025-12-04T09:16:18.1536997Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:16:18.1537260Z NO_SUDO: true 2025-12-04T09:16:18.1537482Z ##[endgroup] 2025-12-04T09:16:18.1676605Z ##[group]Run actions/checkout@v4 2025-12-04T09:16:18.1676933Z with: 2025-12-04T09:16:18.1677221Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:16:18.1677578Z fetch-depth: 0 2025-12-04T09:16:18.1677828Z submodules: recursive 2025-12-04T09:16:18.1678098Z show-progress: false 2025-12-04T09:16:18.1678375Z repository: pytorch/pytorch 2025-12-04T09:16:18.1678771Z token: *** 2025-12-04T09:16:18.1678998Z ssh-strict: true 2025-12-04T09:16:18.1679245Z ssh-user: git 2025-12-04T09:16:18.1679519Z persist-credentials: true 2025-12-04T09:16:18.1679798Z clean: true 2025-12-04T09:16:18.1680057Z sparse-checkout-cone-mode: true 2025-12-04T09:16:18.1680365Z fetch-tags: false 2025-12-04T09:16:18.1680596Z lfs: false 2025-12-04T09:16:18.1680838Z set-safe-directory: true 2025-12-04T09:16:18.1681120Z env: 2025-12-04T09:16:18.1681337Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:16:18.1681612Z ##[endgroup] 2025-12-04T09:16:18.2827955Z Syncing repository: pytorch/pytorch 2025-12-04T09:16:18.2829215Z ##[group]Getting Git version info 2025-12-04T09:16:18.2829704Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2025-12-04T09:16:18.2830368Z [command]/usr/bin/git version 2025-12-04T09:16:18.3033499Z git version 2.50.1 2025-12-04T09:16:18.3070464Z ##[endgroup] 2025-12-04T09:16:18.3081892Z Copying '/home/ec2-user/.gitconfig' to '/home/ec2-user/actions-runner/_work/_temp/bdea0e79-3cbc-4c50-bc53-0b817b81fbdd/.gitconfig' 2025-12-04T09:16:18.3105140Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/bdea0e79-3cbc-4c50-bc53-0b817b81fbdd' before making global git config changes 2025-12-04T09:16:18.3106249Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T09:16:18.3111204Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-12-04T09:16:18.3169582Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2025-12-04T09:16:18.3173120Z ##[group]Initializing the repository 2025-12-04T09:16:18.3178227Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-12-04T09:16:18.3257727Z hint: Using 'master' as the name for the initial branch. This default branch name 2025-12-04T09:16:18.3258369Z hint: is subject to change. To configure the initial branch name to use in all 2025-12-04T09:16:18.3258932Z hint: of your new repositories, which will suppress this warning, call: 2025-12-04T09:16:18.3259340Z hint: 2025-12-04T09:16:18.3259653Z hint: git config --global init.defaultBranch 2025-12-04T09:16:18.3260011Z hint: 2025-12-04T09:16:18.3260378Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2025-12-04T09:16:18.3260972Z hint: 'development'. The just-created branch can be renamed via this command: 2025-12-04T09:16:18.3261402Z hint: 2025-12-04T09:16:18.3261633Z hint: git branch -m 2025-12-04T09:16:18.3261914Z hint: 2025-12-04T09:16:18.3262306Z hint: Disable this message with "git config set advice.defaultBranchName false" 2025-12-04T09:16:18.3266591Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/ 2025-12-04T09:16:18.3278410Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2025-12-04T09:16:18.3328170Z ##[endgroup] 2025-12-04T09:16:18.3328739Z ##[group]Disabling automatic garbage collection 2025-12-04T09:16:18.3331936Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T09:16:18.3365754Z ##[endgroup] 2025-12-04T09:16:18.3366307Z ##[group]Setting up auth 2025-12-04T09:16:18.3373413Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T09:16:18.3408023Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T09:16:18.3850703Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T09:16:18.3884654Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T09:16:18.4292213Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T09:16:18.4327753Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T09:16:18.4697458Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T09:16:18.4748260Z ##[endgroup] 2025-12-04T09:16:18.4748936Z ##[group]Fetching the repository 2025-12-04T09:16:18.4756595Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T09:17:10.0463025Z From https://github.com/pytorch/pytorch 2025-12-04T09:17:10.0465078Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-12-04T09:17:10.0465680Z * [new branch] 2.9.1 -> origin/2.9.1 2025-12-04T09:17:10.0466263Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-12-04T09:17:10.0466903Z * [new branch] Flamefire-patch-1 -> origin/Flamefire-patch-1 2025-12-04T09:17:10.0469027Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-12-04T09:17:10.0471039Z * [new branch] HOPrintFunc -> origin/HOPrintFunc 2025-12-04T09:17:10.0474661Z * [new branch] IvanKobzarev/stack/1 -> origin/IvanKobzarev/stack/1 2025-12-04T09:17:10.0477752Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128 2025-12-04T09:17:10.0479826Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-12-04T09:17:10.0482150Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-12-04T09:17:10.0484217Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-12-04T09:17:10.0486309Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-12-04T09:17:10.0488456Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-12-04T09:17:10.0490711Z * [new branch] Update-Flash-Packaging -> origin/Update-Flash-Packaging 2025-12-04T09:17:10.0492933Z * [new branch] VLA_exp -> origin/VLA_exp 2025-12-04T09:17:10.0495983Z * [new branch] activation_bench -> origin/activation_bench 2025-12-04T09:17:10.0498139Z * [new branch] addmm-heuristic -> origin/addmm-heuristic 2025-12-04T09:17:10.0500978Z * [new branch] adi/onednn_aarch64 -> origin/adi/onednn_aarch64 2025-12-04T09:17:10.0503353Z * [new branch] adi/test -> origin/adi/test 2025-12-04T09:17:10.0505526Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-12-04T09:17:10.0508009Z * [new branch] adi/test_m8g -> origin/adi/test_m8g 2025-12-04T09:17:10.0510235Z * [new branch] adi/test_onednn -> origin/adi/test_onednn 2025-12-04T09:17:10.0512229Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-12-04T09:17:10.0514339Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-12-04T09:17:10.0516387Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-12-04T09:17:10.0519071Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-12-04T09:17:10.0522631Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-12-04T09:17:10.0524890Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-12-04T09:17:10.0528039Z * [new branch] albanD-patch-1 -> origin/albanD-patch-1 2025-12-04T09:17:10.0529840Z * [new branch] also-surround-shimh -> origin/also-surround-shimh 2025-12-04T09:17:10.0533253Z * [new branch] angelayi/aot_compile -> origin/angelayi/aot_compile 2025-12-04T09:17:10.0535593Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-12-04T09:17:10.0537429Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-12-04T09:17:10.0539675Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-12-04T09:17:10.0541691Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-12-04T09:17:10.0544056Z * [new branch] angelayi/inductor_const -> origin/angelayi/inductor_const 2025-12-04T09:17:10.0546111Z * [new branch] angelayi/lstm -> origin/angelayi/lstm 2025-12-04T09:17:10.0548851Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-12-04T09:17:10.0551607Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-12-04T09:17:10.0553844Z * [new branch] angelayi/side_eff -> origin/angelayi/side_eff 2025-12-04T09:17:10.0556137Z * [new branch] angelayi/state_dict -> origin/angelayi/state_dict 2025-12-04T09:17:10.0558433Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-12-04T09:17:10.0560828Z * [new branch] angelayi/symm_mem -> origin/angelayi/symm_mem 2025-12-04T09:17:10.0562925Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-12-04T09:17:10.0565130Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-12-04T09:17:10.0567375Z * [new branch] annotate_assert -> origin/annotate_assert 2025-12-04T09:17:10.0569573Z * [new branch] annotate_fallback_kernel -> origin/annotate_fallback_kernel 2025-12-04T09:17:10.0571761Z * [new branch] annotation_deepcopy -> origin/annotation_deepcopy 2025-12-04T09:17:10.0573966Z * [new branch] annotation_dynamo -> origin/annotation_dynamo 2025-12-04T09:17:10.0576154Z * [new branch] aot_eager_stack_trace -> origin/aot_eager_stack_trace 2025-12-04T09:17:10.0578321Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-12-04T09:17:10.0580450Z * [new branch] aoti_const_device -> origin/aoti_const_device 2025-12-04T09:17:10.0582705Z * [new branch] aoti_fqn_name_interface -> origin/aoti_fqn_name_interface 2025-12-04T09:17:10.0584958Z * [new branch] aoti_package_weights_binary -> origin/aoti_package_weights_binary 2025-12-04T09:17:10.0587074Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-12-04T09:17:10.0590657Z * [new branch] arsh/feat/inductor_check_profiling -> origin/arsh/feat/inductor_check_profiling 2025-12-04T09:17:10.0592719Z * [new branch] async_tp -> origin/async_tp 2025-12-04T09:17:10.0595313Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-12-04T09:17:10.0597425Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-12-04T09:17:10.0599632Z * [new branch] atalman-patch-2 -> origin/atalman-patch-2 2025-12-04T09:17:10.0601888Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-12-04T09:17:10.0604155Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-12-04T09:17:10.0606536Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-12-04T09:17:10.0608752Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-12-04T09:17:10.0611096Z * [new branch] atalman-patch-7 -> origin/atalman-patch-7 2025-12-04T09:17:10.0613368Z * [new branch] atalman-patch-8 -> origin/atalman-patch-8 2025-12-04T09:17:10.0615570Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-12-04T09:17:10.0617775Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-12-04T09:17:10.0620006Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-12-04T09:17:10.0622428Z * [new branch] attention_benchmarking_clean -> origin/attention_benchmarking_clean 2025-12-04T09:17:10.0625610Z * [new branch] bahuang/dt_fix_scalar_add -> origin/bahuang/dt_fix_scalar_add 2025-12-04T09:17:10.0627576Z * [new branch] bahuang/fix_debug_mode -> origin/bahuang/fix_debug_mode 2025-12-04T09:17:10.0629691Z * [new branch] bahuang/fix_expand -> origin/bahuang/fix_expand 2025-12-04T09:17:10.0631795Z * [new branch] bahuang/test -> origin/bahuang/test 2025-12-04T09:17:10.0634851Z * [new branch] base/1.5 -> origin/base/1.5 2025-12-04T09:17:10.0637281Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-12-04T09:17:10.0639416Z * [new branch] bench_scaled_mm_ops -> origin/bench_scaled_mm_ops 2025-12-04T09:17:10.0641788Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-12-04T09:17:10.0643953Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-12-04T09:17:10.0646814Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-12-04T09:17:10.0649997Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-12-04T09:17:10.0652648Z * [new branch] bf/bug-static-input -> origin/bf/bug-static-input 2025-12-04T09:17:10.0654619Z * [new branch] bf/cg-backend -> origin/bf/cg-backend 2025-12-04T09:17:10.0656698Z * [new branch] bf/cg-nccl-test -> origin/bf/cg-nccl-test 2025-12-04T09:17:10.0658731Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-12-04T09:17:10.0660934Z * [new branch] bf/clean-torchbench-hf -> origin/bf/clean-torchbench-hf 2025-12-04T09:17:10.0676107Z * [new branch] bf/combo-debug-log -> origin/bf/combo-debug-log 2025-12-04T09:17:10.0676658Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-12-04T09:17:10.0677295Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-12-04T09:17:10.0679182Z * [new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-12-04T09:17:10.0680087Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-12-04T09:17:10.0680674Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-12-04T09:17:10.0681244Z * [new branch] bf/dynamo-partition -> origin/bf/dynamo-partition 2025-12-04T09:17:10.0681737Z * [new branch] bf/lite -> origin/bf/lite 2025-12-04T09:17:10.0682242Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-12-04T09:17:10.0682865Z * [new branch] bf/partition-cache-free-symbols -> origin/bf/partition-cache-free-symbols 2025-12-04T09:17:10.0683557Z * [new branch] bf/partition-memory-plan -> origin/bf/partition-memory-plan 2025-12-04T09:17:10.0686219Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-12-04T09:17:10.0688395Z * [new branch] bf/partition-view-fallback -> origin/bf/partition-view-fallback 2025-12-04T09:17:10.0690524Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-12-04T09:17:10.0692697Z * [new branch] bf/timm-nov-26-2025 -> origin/bf/timm-nov-26-2025 2025-12-04T09:17:10.0694840Z * [new branch] bf/transformer-pin-4-57-3 -> origin/bf/transformer-pin-4-57-3 2025-12-04T09:17:10.0696964Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-12-04T09:17:10.0699090Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-12-04T09:17:10.0701197Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-12-04T09:17:10.0703441Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-12-04T09:17:10.0705491Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-12-04T09:17:10.0707624Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-12-04T09:17:10.0709709Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-12-04T09:17:10.0711829Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-12-04T09:17:10.0713988Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-12-04T09:17:10.0716410Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-12-04T09:17:10.0718467Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-12-04T09:17:10.0720513Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-12-04T09:17:10.0722748Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-12-04T09:17:10.0725418Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-12-04T09:17:10.0727288Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-12-04T09:17:10.0729396Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-12-04T09:17:10.0732329Z * [new branch] brister/fx_device_type -> origin/brister/fx_device_type 2025-12-04T09:17:10.0734447Z * [new branch] brister/test_inductor_all_fx -> origin/brister/test_inductor_all_fx 2025-12-04T09:17:10.0736551Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-12-04T09:17:10.0738637Z * [new branch] bwd-backup -> origin/bwd-backup 2025-12-04T09:17:10.0740832Z * [new branch] c57382a49 -> origin/c57382a49 2025-12-04T09:17:10.0742997Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-12-04T09:17:10.0745290Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-12-04T09:17:10.0748263Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-12-04T09:17:10.0750533Z * [new branch] cccclai-patch-1 -> origin/cccclai-patch-1 2025-12-04T09:17:10.0752864Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0755623Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0757918Z * [new branch] cherry-pick-162208-by-pytorch_bot_bot_ -> origin/cherry-pick-162208-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0760108Z * [new branch] cherry-pick-163169-by-pytorch_bot_bot_ -> origin/cherry-pick-163169-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0762323Z * [new branch] cherry-pick-165086-by-pytorch_bot_bot_ -> origin/cherry-pick-165086-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0764690Z * [new branch] cherry-pick-165514-by-pytorch_bot_bot_ -> origin/cherry-pick-165514-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0766903Z * [new branch] cherry-pick-165601-by-pytorch_bot_bot_ -> origin/cherry-pick-165601-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0769140Z * [new branch] cherry-pick-165667-by-pytorch_bot_bot_ -> origin/cherry-pick-165667-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0771390Z * [new branch] cherry-pick-165815-by-pytorch_bot_bot_ -> origin/cherry-pick-165815-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0773635Z * [new branch] cherry-pick-165922-by-pytorch_bot_bot_ -> origin/cherry-pick-165922-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0775826Z * [new branch] cherry-pick-166148-by-pytorch_bot_bot_ -> origin/cherry-pick-166148-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0778007Z * [new branch] cherry-pick-166181-by-pytorch_bot_bot_ -> origin/cherry-pick-166181-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0780172Z * [new branch] cherry-pick-166404-by-pytorch_bot_bot_ -> origin/cherry-pick-166404-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0782416Z * [new branch] cherry-pick-166427-by-pytorch_bot_bot_ -> origin/cherry-pick-166427-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0785126Z * [new branch] cherry-pick-166480-by-pytorch_bot_bot_ -> origin/cherry-pick-166480-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0787065Z * [new branch] cherry-pick-166570-by-pytorch_bot_bot_ -> origin/cherry-pick-166570-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0789219Z * [new branch] cherry-pick-166993-by-pytorch_bot_bot_ -> origin/cherry-pick-166993-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0791498Z * [new branch] cherry-pick-167111-by-pytorch_bot_bot_ -> origin/cherry-pick-167111-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0793755Z * [new branch] cherry-pick-167478-by-pytorch_bot_bot_ -> origin/cherry-pick-167478-by-pytorch_bot_bot_ 2025-12-04T09:17:10.0795780Z * [new branch] cherry_pick_166036_166040 -> origin/cherry_pick_166036_166040 2025-12-04T09:17:10.0798012Z * [new branch] cherry_pick_166457 -> origin/cherry_pick_166457 2025-12-04T09:17:10.0800240Z * [new branch] cherrypick_166338 -> origin/cherrypick_166338 2025-12-04T09:17:10.0802469Z * [new branch] cherrypick_166458 -> origin/cherrypick_166458 2025-12-04T09:17:10.0804583Z * [new branch] cherrypick_166586 -> origin/cherrypick_166586 2025-12-04T09:17:10.0806768Z * [new branch] cherrypick_166956 -> origin/cherrypick_166956 2025-12-04T09:17:10.0808948Z * [new branch] ci_attn -> origin/ci_attn 2025-12-04T09:17:10.0811094Z * [new branch] codex-testing -> origin/codex-testing 2025-12-04T09:17:10.0814277Z * [new branch] codex/add-check_memory_overlap-helper-functions -> origin/codex/add-check_memory_overlap-helper-functions 2025-12-04T09:17:10.0816218Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-12-04T09:17:10.0818910Z * [new branch] codex/investigate-segfaults-in-get_tensor_storage_id -> origin/codex/investigate-segfaults-in-get_tensor_storage_id 2025-12-04T09:17:10.0821298Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-12-04T09:17:10.0823239Z * [new branch] compatiblpy39util -> origin/compatiblpy39util 2025-12-04T09:17:10.0825499Z * [new branch] cond_hop_device -> origin/cond_hop_device 2025-12-04T09:17:10.0827767Z * [new branch] context_test -> origin/context_test 2025-12-04T09:17:10.0830746Z * [new branch] copilot/code-style-cleanup-python-pip -> origin/copilot/code-style-cleanup-python-pip 2025-12-04T09:17:10.0833412Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-12-04T09:17:10.0835726Z * [new branch] cpp-docs-dependency-upgrade -> origin/cpp-docs-dependency-upgrade 2025-12-04T09:17:10.0838585Z * [new branch] crpa/typo-in-inductor_comm_lowering -> origin/crpa/typo-in-inductor_comm_lowering 2025-12-04T09:17:10.0841827Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-12-04T09:17:10.0843582Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-12-04T09:17:10.0845757Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-12-04T09:17:10.0848244Z * [new branch] csl/clean_up -> origin/csl/clean_up 2025-12-04T09:17:10.0850451Z * [new branch] csl/fix_retry_segfault_exit -> origin/csl/fix_retry_segfault_exit 2025-12-04T09:17:10.0852451Z * [new branch] csl/katex -> origin/csl/katex 2025-12-04T09:17:10.0854887Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-12-04T09:17:10.0857458Z * [new branch] csl/lint_testing -> origin/csl/lint_testing 2025-12-04T09:17:10.0859967Z * [new branch] csl/lint_thing -> origin/csl/lint_thing 2025-12-04T09:17:10.0862451Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-12-04T09:17:10.0864609Z * [new branch] csl/manually_gen_json -> origin/csl/manually_gen_json 2025-12-04T09:17:10.0866799Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-12-04T09:17:10.0869109Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-12-04T09:17:10.0871301Z * [new branch] csl/print_timing -> origin/csl/print_timing 2025-12-04T09:17:10.0873455Z * [new branch] csl/remove_experiment -> origin/csl/remove_experiment 2025-12-04T09:17:10.0875678Z * [new branch] csl/remove_maybe_unused_var -> origin/csl/remove_maybe_unused_var 2025-12-04T09:17:10.0877990Z * [new branch] csl/remove_repo_specific_autolabel -> origin/csl/remove_repo_specific_autolabel 2025-12-04T09:17:10.0880163Z * [new branch] csl/remove_run_parallel -> origin/csl/remove_run_parallel 2025-12-04T09:17:10.0882282Z * [new branch] csl/remove_unused_vars -> origin/csl/remove_unused_vars 2025-12-04T09:17:10.0884405Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-12-04T09:17:10.0886659Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-12-04T09:17:10.0888869Z * [new branch] csl/smaller_avx_amx_runenrs -> origin/csl/smaller_avx_amx_runenrs 2025-12-04T09:17:10.0890940Z * [new branch] csl/td_job_level -> origin/csl/td_job_level 2025-12-04T09:17:10.0893168Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-12-04T09:17:10.0895436Z * [new branch] csl/test_owners_autograd_dispatch_nn -> origin/csl/test_owners_autograd_dispatch_nn 2025-12-04T09:17:10.0897576Z * [new branch] csl/test_owners_higher_confidence -> origin/csl/test_owners_higher_confidence 2025-12-04T09:17:10.0899701Z * [new branch] csl/upload_json_running -> origin/csl/upload_json_running 2025-12-04T09:17:10.0901899Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-12-04T09:17:10.0904184Z * [new branch] csl/xml_stuff -> origin/csl/xml_stuff 2025-12-04T09:17:10.0906354Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-12-04T09:17:10.0908509Z * [new branch] cuda_mempool -> origin/cuda_mempool 2025-12-04T09:17:10.0910740Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-12-04T09:17:10.0913571Z * [new branch] d4l3k/debug_plane_frtrace -> origin/d4l3k/debug_plane_frtrace 2025-12-04T09:17:10.0916397Z * [new branch] daxia6/2.8o3 -> origin/daxia6/2.8o3 2025-12-04T09:17:10.0918533Z * [new branch] debug-guard -> origin/debug-guard 2025-12-04T09:17:10.0920792Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-12-04T09:17:10.0927169Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 2025-12-04T09:17:10.0929500Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 2025-12-04T09:17:10.0932041Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-12-04T09:17:10.0934159Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-12-04T09:17:10.0937344Z * [new branch] dev/dhruva/flex_attn_opt -> origin/dev/dhruva/flex_attn_opt 2025-12-04T09:17:10.0940472Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-12-04T09:17:10.0942999Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-12-04T09:17:10.0945417Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-12-04T09:17:10.0947529Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-12-04T09:17:10.0949766Z * [new branch] dev/joona/fix_sdpa_memtest -> origin/dev/joona/fix_sdpa_memtest 2025-12-04T09:17:10.0952178Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-12-04T09:17:10.0954539Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-12-04T09:17:10.0957232Z * [new branch] dev/joona/scalar_clamp -> origin/dev/joona/scalar_clamp 2025-12-04T09:17:10.0959925Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-12-04T09:17:10.0962821Z * [new branch] dev/joona/sdpa_api -> origin/dev/joona/sdpa_api 2025-12-04T09:17:10.0965187Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-12-04T09:17:10.0967677Z * [new branch] dev/joona/ulpAssertClose -> origin/dev/joona/ulpAssertClose 2025-12-04T09:17:10.0969998Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-12-04T09:17:10.0972076Z * [new branch] disp_counter -> origin/disp_counter 2025-12-04T09:17:10.0974354Z * [new branch] divyanshk-patch-1 -> origin/divyanshk-patch-1 2025-12-04T09:17:10.0976382Z * [new branch] docs -> origin/docs 2025-12-04T09:17:10.0978596Z * [new branch] documentation -> origin/documentation 2025-12-04T09:17:10.0980732Z * [new branch] eager_model_benchmarks -> origin/eager_model_benchmarks 2025-12-04T09:17:10.0983719Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-12-04T09:17:10.0985733Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-12-04T09:17:10.0987761Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-12-04T09:17:10.0990095Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-12-04T09:17:10.0992249Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-12-04T09:17:10.0994470Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-12-04T09:17:10.0996696Z * [new branch] eqy-patch-4 -> origin/eqy-patch-4 2025-12-04T09:17:10.0998866Z * [new branch] eqy-patch-5 -> origin/eqy-patch-5 2025-12-04T09:17:10.1001008Z * [new branch] eqy-patch-6 -> origin/eqy-patch-6 2025-12-04T09:17:10.1003906Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-12-04T09:17:10.1006159Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-12-04T09:17:10.1008145Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-12-04T09:17:10.1010261Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-12-04T09:17:10.1012471Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-12-04T09:17:10.1014868Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-12-04T09:17:10.1017374Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-12-04T09:17:10.1019399Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-12-04T09:17:10.1021811Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-12-04T09:17:10.1023899Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-12-04T09:17:10.1026475Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-12-04T09:17:10.1028775Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-12-04T09:17:10.1030752Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-12-04T09:17:10.1032958Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-12-04T09:17:10.1035426Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-12-04T09:17:10.1037657Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-12-04T09:17:10.1039841Z * [new branch] exclamaforte/profiler-visualization -> origin/exclamaforte/profiler-visualization 2025-12-04T09:17:10.1042085Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-12-04T09:17:10.1044345Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-12-04T09:17:10.1046667Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-12-04T09:17:10.1048720Z * [new branch] exec -> origin/exec 2025-12-04T09:17:10.1051123Z * [new branch] experimental-mosaic -> origin/experimental-mosaic 2025-12-04T09:17:10.1053428Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-12-04T09:17:10.1055663Z * [new branch] export-D71412006 -> origin/export-D71412006 2025-12-04T09:17:10.1058093Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-12-04T09:17:10.1060267Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-12-04T09:17:10.1062403Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-12-04T09:17:10.1064806Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-12-04T09:17:10.1067158Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-12-04T09:17:10.1069404Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-12-04T09:17:10.1071550Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-12-04T09:17:10.1073724Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-12-04T09:17:10.1076078Z * [new branch] export-D82250826 -> origin/export-D82250826 2025-12-04T09:17:10.1078346Z * [new branch] export-D82253817 -> origin/export-D82253817 2025-12-04T09:17:10.1080518Z * [new branch] export-D83541846 -> origin/export-D83541846 2025-12-04T09:17:10.1082794Z * [new branch] export-D83627170 -> origin/export-D83627170 2025-12-04T09:17:10.1085496Z * [new branch] export-D83766701 -> origin/export-D83766701 2025-12-04T09:17:10.1087677Z * [new branch] export-D83768878 -> origin/export-D83768878 2025-12-04T09:17:10.1089920Z * [new branch] export-D83769447 -> origin/export-D83769447 2025-12-04T09:17:10.1092109Z * [new branch] export-D84089824 -> origin/export-D84089824 2025-12-04T09:17:10.1094312Z * [new branch] export-D84213020 -> origin/export-D84213020 2025-12-04T09:17:10.1097091Z * [new branch] export-D84373821 -> origin/export-D84373821 2025-12-04T09:17:10.1099527Z * [new branch] export-D84612194 -> origin/export-D84612194 2025-12-04T09:17:10.1101639Z * [new branch] export-D84890985 -> origin/export-D84890985 2025-12-04T09:17:10.1103927Z * [new branch] export-D85122326 -> origin/export-D85122326 2025-12-04T09:17:10.1106279Z * [new branch] export-D86256198 -> origin/export-D86256198 2025-12-04T09:17:10.1108468Z * [new branch] export-D86460608 -> origin/export-D86460608 2025-12-04T09:17:10.1110777Z * [new branch] export-D86474796 -> origin/export-D86474796 2025-12-04T09:17:10.1113204Z * [new branch] export-D86712396 -> origin/export-D86712396 2025-12-04T09:17:10.1115399Z * [new branch] export-D87022129 -> origin/export-D87022129 2025-12-04T09:17:10.1117685Z * [new branch] export-D87838959 -> origin/export-D87838959 2025-12-04T09:17:10.1120003Z * [new branch] export-D88319437 -> origin/export-D88319437 2025-12-04T09:17:10.1122441Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-12-04T09:17:10.1124871Z * [new branch] ezyang-titan-october -> origin/ezyang-titan-october 2025-12-04T09:17:10.1127760Z * [new branch] ezyang-titan-october2 -> origin/ezyang-titan-october2 2025-12-04T09:17:10.1129851Z * [new branch] ezyang-war -> origin/ezyang-war 2025-12-04T09:17:10.1132755Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-12-04T09:17:10.1134854Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-12-04T09:17:10.1137855Z * [new branch] fadeputr/sequence_fbgemm -> origin/fadeputr/sequence_fbgemm 2025-12-04T09:17:10.1140027Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-12-04T09:17:10.1143160Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-12-04T09:17:10.1145454Z * [new branch] fca -> origin/fca 2025-12-04T09:17:10.1147659Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-12-04T09:17:10.1149905Z * [new branch] fca5 -> origin/fca5 2025-12-04T09:17:10.1152777Z * [new branch] feature/justknobs-cpp -> origin/feature/justknobs-cpp 2025-12-04T09:17:10.1154943Z * [new branch] feature/numa-forkserver -> origin/feature/numa-forkserver 2025-12-04T09:17:10.1157616Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-12-04T09:17:10.1159819Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-12-04T09:17:10.1162658Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-12-04T09:17:10.1164849Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-12-04T09:17:10.1166989Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-12-04T09:17:10.1169053Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-12-04T09:17:10.1171146Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-12-04T09:17:10.1173242Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-12-04T09:17:10.1175371Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-12-04T09:17:10.1177401Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-12-04T09:17:10.1179709Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-12-04T09:17:10.1182118Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-12-04T09:17:10.1184649Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-12-04T09:17:10.1186752Z * [new branch] fix_addmm_issue -> origin/fix_addmm_issue 2025-12-04T09:17:10.1189041Z * [new branch] fix_amd_missing_cluster_dims -> origin/fix_amd_missing_cluster_dims 2025-12-04T09:17:10.1191239Z * [new branch] fix_bench_bwd_pass -> origin/fix_bench_bwd_pass 2025-12-04T09:17:10.1193468Z * [new branch] fix_mem_profiler_config -> origin/fix_mem_profiler_config 2025-12-04T09:17:10.1195596Z * [new branch] fix_nvrtc_discovery -> origin/fix_nvrtc_discovery 2025-12-04T09:17:10.1197802Z * [new branch] fix_op_runner -> origin/fix_op_runner 2025-12-04T09:17:10.1200071Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-12-04T09:17:10.1202295Z * [new branch] fixes-triage -> origin/fixes-triage 2025-12-04T09:17:10.1204517Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-12-04T09:17:10.1207021Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-12-04T09:17:10.1209199Z * [new branch] flex-flash -> origin/flex-flash 2025-12-04T09:17:10.1211474Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-12-04T09:17:10.1214202Z * [new branch] flex_flash -> origin/flex_flash 2025-12-04T09:17:10.1216842Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-12-04T09:17:10.1219019Z * [new branch] fmassa/tests_comm_compute_scheduler -> origin/fmassa/tests_comm_compute_scheduler 2025-12-04T09:17:10.1221061Z * [new branch] forkserver_fix -> origin/forkserver_fix 2025-12-04T09:17:10.1223380Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-12-04T09:17:10.1225921Z * [new branch] fx_cpp -> origin/fx_cpp 2025-12-04T09:17:10.1228792Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-12-04T09:17:10.1231667Z * [new branch] galv-patch-1 -> origin/galv-patch-1 2025-12-04T09:17:10.1234841Z * [new branch] galv/cudagraphs-conditional-nodes-4 -> origin/galv/cudagraphs-conditional-nodes-4 2025-12-04T09:17:10.1237662Z * [new branch] georgehong/cmakelists-patch -> origin/georgehong/cmakelists-patch 2025-12-04T09:17:10.1241922Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-12-04T09:17:10.1244062Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-12-04T09:17:10.1247528Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-12-04T09:17:10.1249669Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-12-04T09:17:10.1253434Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-12-04T09:17:10.1255568Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-12-04T09:17:10.1258985Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-12-04T09:17:10.1261109Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-12-04T09:17:10.1263387Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-12-04T09:17:10.1266315Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-12-04T09:17:10.1268402Z * [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-12-04T09:17:10.1270550Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-12-04T09:17:10.1273594Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-12-04T09:17:10.1275618Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-12-04T09:17:10.1277781Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-12-04T09:17:10.1280531Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-12-04T09:17:10.1282667Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-12-04T09:17:10.1284844Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-12-04T09:17:10.1287942Z * [new branch] gh/H-Huang/226/base -> origin/gh/H-Huang/226/base 2025-12-04T09:17:10.1290069Z * [new branch] gh/H-Huang/226/head -> origin/gh/H-Huang/226/head 2025-12-04T09:17:10.1292249Z * [new branch] gh/H-Huang/226/orig -> origin/gh/H-Huang/226/orig 2025-12-04T09:17:10.1295061Z * [new branch] gh/H-Huang/228/base -> origin/gh/H-Huang/228/base 2025-12-04T09:17:10.1297153Z * [new branch] gh/H-Huang/228/head -> origin/gh/H-Huang/228/head 2025-12-04T09:17:10.1299314Z * [new branch] gh/H-Huang/228/orig -> origin/gh/H-Huang/228/orig 2025-12-04T09:17:10.1302817Z * [new branch] gh/IvanKobzarev/150/base -> origin/gh/IvanKobzarev/150/base 2025-12-04T09:17:10.1305039Z * [new branch] gh/IvanKobzarev/150/head -> origin/gh/IvanKobzarev/150/head 2025-12-04T09:17:10.1307141Z * [new branch] gh/IvanKobzarev/150/orig -> origin/gh/IvanKobzarev/150/orig 2025-12-04T09:17:10.1310068Z * [new branch] gh/IvanKobzarev/157/base -> origin/gh/IvanKobzarev/157/base 2025-12-04T09:17:10.1312285Z * [new branch] gh/IvanKobzarev/157/head -> origin/gh/IvanKobzarev/157/head 2025-12-04T09:17:10.1314457Z * [new branch] gh/IvanKobzarev/157/orig -> origin/gh/IvanKobzarev/157/orig 2025-12-04T09:17:10.1317444Z * [new branch] gh/IvanKobzarev/159/base -> origin/gh/IvanKobzarev/159/base 2025-12-04T09:17:10.1319646Z * [new branch] gh/IvanKobzarev/159/head -> origin/gh/IvanKobzarev/159/head 2025-12-04T09:17:10.1321897Z * [new branch] gh/IvanKobzarev/159/orig -> origin/gh/IvanKobzarev/159/orig 2025-12-04T09:17:10.1325240Z * [new branch] gh/IvanKobzarev/162/base -> origin/gh/IvanKobzarev/162/base 2025-12-04T09:17:10.1327766Z * [new branch] gh/IvanKobzarev/162/head -> origin/gh/IvanKobzarev/162/head 2025-12-04T09:17:10.1329952Z * [new branch] gh/IvanKobzarev/162/orig -> origin/gh/IvanKobzarev/162/orig 2025-12-04T09:17:10.1332698Z * [new branch] gh/IvanKobzarev/163/base -> origin/gh/IvanKobzarev/163/base 2025-12-04T09:17:10.1334801Z * [new branch] gh/IvanKobzarev/163/head -> origin/gh/IvanKobzarev/163/head 2025-12-04T09:17:10.1336922Z * [new branch] gh/IvanKobzarev/163/orig -> origin/gh/IvanKobzarev/163/orig 2025-12-04T09:17:10.1339859Z * [new branch] gh/IvanKobzarev/166/base -> origin/gh/IvanKobzarev/166/base 2025-12-04T09:17:10.1342345Z * [new branch] gh/IvanKobzarev/166/head -> origin/gh/IvanKobzarev/166/head 2025-12-04T09:17:10.1344493Z * [new branch] gh/IvanKobzarev/166/orig -> origin/gh/IvanKobzarev/166/orig 2025-12-04T09:17:10.1347530Z * [new branch] gh/IvanKobzarev/167/base -> origin/gh/IvanKobzarev/167/base 2025-12-04T09:17:10.1349615Z * [new branch] gh/IvanKobzarev/167/head -> origin/gh/IvanKobzarev/167/head 2025-12-04T09:17:10.1351791Z * [new branch] gh/IvanKobzarev/167/orig -> origin/gh/IvanKobzarev/167/orig 2025-12-04T09:17:10.1354996Z * [new branch] gh/IvanKobzarev/168/base -> origin/gh/IvanKobzarev/168/base 2025-12-04T09:17:10.1357023Z * [new branch] gh/IvanKobzarev/168/head -> origin/gh/IvanKobzarev/168/head 2025-12-04T09:17:10.1359079Z * [new branch] gh/IvanKobzarev/168/orig -> origin/gh/IvanKobzarev/168/orig 2025-12-04T09:17:10.1361965Z * [new branch] gh/IvanKobzarev/169/base -> origin/gh/IvanKobzarev/169/base 2025-12-04T09:17:10.1364424Z * [new branch] gh/IvanKobzarev/169/head -> origin/gh/IvanKobzarev/169/head 2025-12-04T09:17:10.1366342Z * [new branch] gh/IvanKobzarev/169/orig -> origin/gh/IvanKobzarev/169/orig 2025-12-04T09:17:10.1369151Z * [new branch] gh/IvanKobzarev/170/base -> origin/gh/IvanKobzarev/170/base 2025-12-04T09:17:10.1371257Z * [new branch] gh/IvanKobzarev/170/head -> origin/gh/IvanKobzarev/170/head 2025-12-04T09:17:10.1373466Z * [new branch] gh/IvanKobzarev/170/orig -> origin/gh/IvanKobzarev/170/orig 2025-12-04T09:17:10.1376530Z * [new branch] gh/IvanKobzarev/171/base -> origin/gh/IvanKobzarev/171/base 2025-12-04T09:17:10.1378748Z * [new branch] gh/IvanKobzarev/171/head -> origin/gh/IvanKobzarev/171/head 2025-12-04T09:17:10.1381148Z * [new branch] gh/IvanKobzarev/171/orig -> origin/gh/IvanKobzarev/171/orig 2025-12-04T09:17:10.1384081Z * [new branch] gh/IvanKobzarev/172/base -> origin/gh/IvanKobzarev/172/base 2025-12-04T09:17:10.1386456Z * [new branch] gh/IvanKobzarev/172/head -> origin/gh/IvanKobzarev/172/head 2025-12-04T09:17:10.1388574Z * [new branch] gh/IvanKobzarev/172/orig -> origin/gh/IvanKobzarev/172/orig 2025-12-04T09:17:10.1391412Z * [new branch] gh/IvanKobzarev/173/base -> origin/gh/IvanKobzarev/173/base 2025-12-04T09:17:10.1393647Z * [new branch] gh/IvanKobzarev/173/head -> origin/gh/IvanKobzarev/173/head 2025-12-04T09:17:10.1395777Z * [new branch] gh/IvanKobzarev/173/orig -> origin/gh/IvanKobzarev/173/orig 2025-12-04T09:17:10.1398657Z * [new branch] gh/IvanKobzarev/174/base -> origin/gh/IvanKobzarev/174/base 2025-12-04T09:17:10.1400938Z * [new branch] gh/IvanKobzarev/174/head -> origin/gh/IvanKobzarev/174/head 2025-12-04T09:17:10.1403086Z * [new branch] gh/IvanKobzarev/174/orig -> origin/gh/IvanKobzarev/174/orig 2025-12-04T09:17:10.1405965Z * [new branch] gh/IvanKobzarev/175/base -> origin/gh/IvanKobzarev/175/base 2025-12-04T09:17:10.1408293Z * [new branch] gh/IvanKobzarev/175/head -> origin/gh/IvanKobzarev/175/head 2025-12-04T09:17:10.1410559Z * [new branch] gh/IvanKobzarev/175/orig -> origin/gh/IvanKobzarev/175/orig 2025-12-04T09:17:10.1413592Z * [new branch] gh/IvanKobzarev/176/base -> origin/gh/IvanKobzarev/176/base 2025-12-04T09:17:10.1415753Z * [new branch] gh/IvanKobzarev/176/head -> origin/gh/IvanKobzarev/176/head 2025-12-04T09:17:10.1417816Z * [new branch] gh/IvanKobzarev/176/orig -> origin/gh/IvanKobzarev/176/orig 2025-12-04T09:17:10.1421418Z * [new branch] gh/IvanKobzarev/177/base -> origin/gh/IvanKobzarev/177/base 2025-12-04T09:17:10.1423560Z * [new branch] gh/IvanKobzarev/177/head -> origin/gh/IvanKobzarev/177/head 2025-12-04T09:17:10.1426008Z * [new branch] gh/IvanKobzarev/177/orig -> origin/gh/IvanKobzarev/177/orig 2025-12-04T09:17:10.1429302Z * [new branch] gh/IvanKobzarev/178/base -> origin/gh/IvanKobzarev/178/base 2025-12-04T09:17:10.1431337Z * [new branch] gh/IvanKobzarev/178/head -> origin/gh/IvanKobzarev/178/head 2025-12-04T09:17:10.1433581Z * [new branch] gh/IvanKobzarev/178/orig -> origin/gh/IvanKobzarev/178/orig 2025-12-04T09:17:10.1436861Z * [new branch] gh/IvanKobzarev/179/base -> origin/gh/IvanKobzarev/179/base 2025-12-04T09:17:10.1438999Z * [new branch] gh/IvanKobzarev/179/head -> origin/gh/IvanKobzarev/179/head 2025-12-04T09:17:10.1441251Z * [new branch] gh/IvanKobzarev/179/orig -> origin/gh/IvanKobzarev/179/orig 2025-12-04T09:17:10.1444385Z * [new branch] gh/IvanKobzarev/180/base -> origin/gh/IvanKobzarev/180/base 2025-12-04T09:17:10.1446641Z * [new branch] gh/IvanKobzarev/180/head -> origin/gh/IvanKobzarev/180/head 2025-12-04T09:17:10.1448745Z * [new branch] gh/IvanKobzarev/180/orig -> origin/gh/IvanKobzarev/180/orig 2025-12-04T09:17:10.1452155Z * [new branch] gh/IvanKobzarev/181/base -> origin/gh/IvanKobzarev/181/base 2025-12-04T09:17:10.1454189Z * [new branch] gh/IvanKobzarev/181/head -> origin/gh/IvanKobzarev/181/head 2025-12-04T09:17:10.1456290Z * [new branch] gh/IvanKobzarev/181/orig -> origin/gh/IvanKobzarev/181/orig 2025-12-04T09:17:10.1459754Z * [new branch] gh/IvanKobzarev/182/base -> origin/gh/IvanKobzarev/182/base 2025-12-04T09:17:10.1461724Z * [new branch] gh/IvanKobzarev/182/head -> origin/gh/IvanKobzarev/182/head 2025-12-04T09:17:10.1463988Z * [new branch] gh/IvanKobzarev/182/orig -> origin/gh/IvanKobzarev/182/orig 2025-12-04T09:17:10.1467092Z * [new branch] gh/IvanKobzarev/183/base -> origin/gh/IvanKobzarev/183/base 2025-12-04T09:17:10.1469643Z * [new branch] gh/IvanKobzarev/183/head -> origin/gh/IvanKobzarev/183/head 2025-12-04T09:17:10.1471598Z * [new branch] gh/IvanKobzarev/183/orig -> origin/gh/IvanKobzarev/183/orig 2025-12-04T09:17:10.1474734Z * [new branch] gh/IvanKobzarev/184/base -> origin/gh/IvanKobzarev/184/base 2025-12-04T09:17:10.1477212Z * [new branch] gh/IvanKobzarev/184/head -> origin/gh/IvanKobzarev/184/head 2025-12-04T09:17:10.1479255Z * [new branch] gh/IvanKobzarev/184/orig -> origin/gh/IvanKobzarev/184/orig 2025-12-04T09:17:10.1482993Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-12-04T09:17:10.1485025Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-12-04T09:17:10.1487763Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-12-04T09:17:10.1489893Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-12-04T09:17:10.1493218Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-12-04T09:17:10.1495620Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-12-04T09:17:10.1498266Z * [new branch] gh/NikhilAPatel/5/base -> origin/gh/NikhilAPatel/5/base 2025-12-04T09:17:10.1500764Z * [new branch] gh/NikhilAPatel/5/head -> origin/gh/NikhilAPatel/5/head 2025-12-04T09:17:10.1503178Z * [new branch] gh/NikhilAPatel/5/orig -> origin/gh/NikhilAPatel/5/orig 2025-12-04T09:17:10.1506385Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-12-04T09:17:10.1508600Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-12-04T09:17:10.1510773Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-12-04T09:17:10.1513937Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-12-04T09:17:10.1516033Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-12-04T09:17:10.1518272Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-12-04T09:17:10.1521096Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-12-04T09:17:10.1536437Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-12-04T09:17:10.1537292Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-12-04T09:17:10.1538122Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-12-04T09:17:10.1539033Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-12-04T09:17:10.1539547Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-12-04T09:17:10.1540043Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-12-04T09:17:10.1540737Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-12-04T09:17:10.1541438Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-12-04T09:17:10.1542093Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-12-04T09:17:10.1543205Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-12-04T09:17:10.1545673Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 2025-12-04T09:17:10.1548389Z * [new branch] gh/PaliC/25/head -> origin/gh/PaliC/25/head 2025-12-04T09:17:10.1550562Z * [new branch] gh/PaliC/25/next -> origin/gh/PaliC/25/next 2025-12-04T09:17:10.1552794Z * [new branch] gh/PaliC/25/orig -> origin/gh/PaliC/25/orig 2025-12-04T09:17:10.1555715Z * [new branch] gh/PaliC/26/head -> origin/gh/PaliC/26/head 2025-12-04T09:17:10.1557630Z * [new branch] gh/PaliC/26/next -> origin/gh/PaliC/26/next 2025-12-04T09:17:10.1559863Z * [new branch] gh/PaliC/26/orig -> origin/gh/PaliC/26/orig 2025-12-04T09:17:10.1562659Z * [new branch] gh/PaliC/27/next -> origin/gh/PaliC/27/next 2025-12-04T09:17:10.1565433Z * [new branch] gh/PaliC/28/head -> origin/gh/PaliC/28/head 2025-12-04T09:17:10.1567422Z * [new branch] gh/PaliC/28/next -> origin/gh/PaliC/28/next 2025-12-04T09:17:10.1569611Z * [new branch] gh/PaliC/28/orig -> origin/gh/PaliC/28/orig 2025-12-04T09:17:10.1572839Z * [new branch] gh/PaliC/29/head -> origin/gh/PaliC/29/head 2025-12-04T09:17:10.1574285Z * [new branch] gh/PaliC/29/next -> origin/gh/PaliC/29/next 2025-12-04T09:17:10.1576624Z * [new branch] gh/PaliC/29/orig -> origin/gh/PaliC/29/orig 2025-12-04T09:17:10.1579645Z * [new branch] gh/PaliC/30/head -> origin/gh/PaliC/30/head 2025-12-04T09:17:10.1581770Z * [new branch] gh/PaliC/30/next -> origin/gh/PaliC/30/next 2025-12-04T09:17:10.1583859Z * [new branch] gh/PaliC/30/orig -> origin/gh/PaliC/30/orig 2025-12-04T09:17:10.1586706Z * [new branch] gh/PaliC/31/head -> origin/gh/PaliC/31/head 2025-12-04T09:17:10.1589422Z * [new branch] gh/PaliC/31/next -> origin/gh/PaliC/31/next 2025-12-04T09:17:10.1591375Z * [new branch] gh/PaliC/31/orig -> origin/gh/PaliC/31/orig 2025-12-04T09:17:10.1594704Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-12-04T09:17:10.1597021Z * [new branch] gh/PaulZhang12/25/head -> origin/gh/PaulZhang12/25/head 2025-12-04T09:17:10.1599158Z * [new branch] gh/PaulZhang12/25/orig -> origin/gh/PaulZhang12/25/orig 2025-12-04T09:17:10.1602381Z * [new branch] gh/PaulZhang12/28/base -> origin/gh/PaulZhang12/28/base 2025-12-04T09:17:10.1604334Z * [new branch] gh/PaulZhang12/28/head -> origin/gh/PaulZhang12/28/head 2025-12-04T09:17:10.1606487Z * [new branch] gh/PaulZhang12/28/orig -> origin/gh/PaulZhang12/28/orig 2025-12-04T09:17:10.1609607Z * [new branch] gh/PaulZhang12/31/base -> origin/gh/PaulZhang12/31/base 2025-12-04T09:17:10.1612366Z * [new branch] gh/PaulZhang12/31/head -> origin/gh/PaulZhang12/31/head 2025-12-04T09:17:10.1614650Z * [new branch] gh/PaulZhang12/31/orig -> origin/gh/PaulZhang12/31/orig 2025-12-04T09:17:10.1617042Z * [new branch] gh/PaulZhang12/37/base -> origin/gh/PaulZhang12/37/base 2025-12-04T09:17:10.1618502Z * [new branch] gh/PaulZhang12/37/head -> origin/gh/PaulZhang12/37/head 2025-12-04T09:17:10.1621258Z * [new branch] gh/PaulZhang12/37/orig -> origin/gh/PaulZhang12/37/orig 2025-12-04T09:17:10.1624162Z * [new branch] gh/PaulZhang12/40/base -> origin/gh/PaulZhang12/40/base 2025-12-04T09:17:10.1626619Z * [new branch] gh/PaulZhang12/40/head -> origin/gh/PaulZhang12/40/head 2025-12-04T09:17:10.1628696Z * [new branch] gh/PaulZhang12/40/orig -> origin/gh/PaulZhang12/40/orig 2025-12-04T09:17:10.1631538Z * [new branch] gh/PaulZhang12/42/base -> origin/gh/PaulZhang12/42/base 2025-12-04T09:17:10.1633734Z * [new branch] gh/PaulZhang12/42/head -> origin/gh/PaulZhang12/42/head 2025-12-04T09:17:10.1636555Z * [new branch] gh/PaulZhang12/43/base -> origin/gh/PaulZhang12/43/base 2025-12-04T09:17:10.1638681Z * [new branch] gh/PaulZhang12/43/head -> origin/gh/PaulZhang12/43/head 2025-12-04T09:17:10.1640772Z * [new branch] gh/PaulZhang12/43/orig -> origin/gh/PaulZhang12/43/orig 2025-12-04T09:17:10.1643731Z * [new branch] gh/PaulZhang12/44/base -> origin/gh/PaulZhang12/44/base 2025-12-04T09:17:10.1645760Z * [new branch] gh/PaulZhang12/44/head -> origin/gh/PaulZhang12/44/head 2025-12-04T09:17:10.1648636Z * [new branch] gh/PaulZhang12/45/base -> origin/gh/PaulZhang12/45/base 2025-12-04T09:17:10.1651054Z * [new branch] gh/PaulZhang12/45/head -> origin/gh/PaulZhang12/45/head 2025-12-04T09:17:10.1652914Z * [new branch] gh/PaulZhang12/45/orig -> origin/gh/PaulZhang12/45/orig 2025-12-04T09:17:10.1655845Z * [new branch] gh/PaulZhang12/46/base -> origin/gh/PaulZhang12/46/base 2025-12-04T09:17:10.1657958Z * [new branch] gh/PaulZhang12/46/head -> origin/gh/PaulZhang12/46/head 2025-12-04T09:17:10.1660154Z * [new branch] gh/PaulZhang12/46/orig -> origin/gh/PaulZhang12/46/orig 2025-12-04T09:17:10.1663126Z * [new branch] gh/PaulZhang12/47/base -> origin/gh/PaulZhang12/47/base 2025-12-04T09:17:10.1665683Z * [new branch] gh/PaulZhang12/47/head -> origin/gh/PaulZhang12/47/head 2025-12-04T09:17:10.1667643Z * [new branch] gh/PaulZhang12/47/orig -> origin/gh/PaulZhang12/47/orig 2025-12-04T09:17:10.1670325Z * [new branch] gh/PaulZhang12/48/base -> origin/gh/PaulZhang12/48/base 2025-12-04T09:17:10.1672940Z * [new branch] gh/PaulZhang12/48/head -> origin/gh/PaulZhang12/48/head 2025-12-04T09:17:10.1675110Z * [new branch] gh/PaulZhang12/48/orig -> origin/gh/PaulZhang12/48/orig 2025-12-04T09:17:10.1678481Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-12-04T09:17:10.1680594Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-12-04T09:17:10.1684037Z * [new branch] gh/SherlockNoMad/1/base -> origin/gh/SherlockNoMad/1/base 2025-12-04T09:17:10.1686282Z * [new branch] gh/SherlockNoMad/1/head -> origin/gh/SherlockNoMad/1/head 2025-12-04T09:17:10.1689158Z * [new branch] gh/SherlockNoMad/10/base -> origin/gh/SherlockNoMad/10/base 2025-12-04T09:17:10.1691338Z * [new branch] gh/SherlockNoMad/10/head -> origin/gh/SherlockNoMad/10/head 2025-12-04T09:17:10.1693453Z * [new branch] gh/SherlockNoMad/10/orig -> origin/gh/SherlockNoMad/10/orig 2025-12-04T09:17:10.1696239Z * [new branch] gh/SherlockNoMad/11/base -> origin/gh/SherlockNoMad/11/base 2025-12-04T09:17:10.1698429Z * [new branch] gh/SherlockNoMad/11/head -> origin/gh/SherlockNoMad/11/head 2025-12-04T09:17:10.1700814Z * [new branch] gh/SherlockNoMad/11/orig -> origin/gh/SherlockNoMad/11/orig 2025-12-04T09:17:10.1703525Z * [new branch] gh/SherlockNoMad/12/base -> origin/gh/SherlockNoMad/12/base 2025-12-04T09:17:10.1706015Z * [new branch] gh/SherlockNoMad/12/head -> origin/gh/SherlockNoMad/12/head 2025-12-04T09:17:10.1707720Z * [new branch] gh/SherlockNoMad/12/orig -> origin/gh/SherlockNoMad/12/orig 2025-12-04T09:17:10.1710963Z * [new branch] gh/SherlockNoMad/15/base -> origin/gh/SherlockNoMad/15/base 2025-12-04T09:17:10.1713017Z * [new branch] gh/SherlockNoMad/15/head -> origin/gh/SherlockNoMad/15/head 2025-12-04T09:17:10.1715173Z * [new branch] gh/SherlockNoMad/15/orig -> origin/gh/SherlockNoMad/15/orig 2025-12-04T09:17:10.1718009Z * [new branch] gh/SherlockNoMad/17/base -> origin/gh/SherlockNoMad/17/base 2025-12-04T09:17:10.1720129Z * [new branch] gh/SherlockNoMad/17/head -> origin/gh/SherlockNoMad/17/head 2025-12-04T09:17:10.1722236Z * [new branch] gh/SherlockNoMad/17/orig -> origin/gh/SherlockNoMad/17/orig 2025-12-04T09:17:10.1725398Z * [new branch] gh/SherlockNoMad/18/base -> origin/gh/SherlockNoMad/18/base 2025-12-04T09:17:10.1730459Z * [new branch] gh/SherlockNoMad/18/head -> origin/gh/SherlockNoMad/18/head 2025-12-04T09:17:10.1732031Z * [new branch] gh/SherlockNoMad/18/orig -> origin/gh/SherlockNoMad/18/orig 2025-12-04T09:17:10.1734927Z * [new branch] gh/SherlockNoMad/19/base -> origin/gh/SherlockNoMad/19/base 2025-12-04T09:17:10.1737235Z * [new branch] gh/SherlockNoMad/19/head -> origin/gh/SherlockNoMad/19/head 2025-12-04T09:17:10.1739365Z * [new branch] gh/SherlockNoMad/19/orig -> origin/gh/SherlockNoMad/19/orig 2025-12-04T09:17:10.1742426Z * [new branch] gh/SherlockNoMad/2/base -> origin/gh/SherlockNoMad/2/base 2025-12-04T09:17:10.1744433Z * [new branch] gh/SherlockNoMad/2/head -> origin/gh/SherlockNoMad/2/head 2025-12-04T09:17:10.1747267Z * [new branch] gh/SherlockNoMad/20/base -> origin/gh/SherlockNoMad/20/base 2025-12-04T09:17:10.1749422Z * [new branch] gh/SherlockNoMad/20/head -> origin/gh/SherlockNoMad/20/head 2025-12-04T09:17:10.1751515Z * [new branch] gh/SherlockNoMad/20/orig -> origin/gh/SherlockNoMad/20/orig 2025-12-04T09:17:10.1754707Z * [new branch] gh/SherlockNoMad/21/base -> origin/gh/SherlockNoMad/21/base 2025-12-04T09:17:10.1756873Z * [new branch] gh/SherlockNoMad/21/head -> origin/gh/SherlockNoMad/21/head 2025-12-04T09:17:10.1759271Z * [new branch] gh/SherlockNoMad/21/orig -> origin/gh/SherlockNoMad/21/orig 2025-12-04T09:17:10.1761790Z * [new branch] gh/SherlockNoMad/3/base -> origin/gh/SherlockNoMad/3/base 2025-12-04T09:17:10.1764217Z * [new branch] gh/SherlockNoMad/3/head -> origin/gh/SherlockNoMad/3/head 2025-12-04T09:17:10.1766626Z * [new branch] gh/SherlockNoMad/4/base -> origin/gh/SherlockNoMad/4/base 2025-12-04T09:17:10.1768857Z * [new branch] gh/SherlockNoMad/4/head -> origin/gh/SherlockNoMad/4/head 2025-12-04T09:17:10.1771656Z * [new branch] gh/SherlockNoMad/5/base -> origin/gh/SherlockNoMad/5/base 2025-12-04T09:17:10.1773706Z * [new branch] gh/SherlockNoMad/5/head -> origin/gh/SherlockNoMad/5/head 2025-12-04T09:17:10.1778153Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-12-04T09:17:10.1780565Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-12-04T09:17:10.1783478Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-12-04T09:17:10.1786765Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-12-04T09:17:10.1790156Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-12-04T09:17:10.1792049Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-12-04T09:17:10.1795336Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-12-04T09:17:10.1796940Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-12-04T09:17:10.1800040Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-12-04T09:17:10.1802423Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-12-04T09:17:10.1805009Z * [new branch] gh/StrongerXi/73/base -> origin/gh/StrongerXi/73/base 2025-12-04T09:17:10.1807132Z * [new branch] gh/StrongerXi/73/head -> origin/gh/StrongerXi/73/head 2025-12-04T09:17:10.1809317Z * [new branch] gh/StrongerXi/73/orig -> origin/gh/StrongerXi/73/orig 2025-12-04T09:17:10.1812844Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-12-04T09:17:10.1814975Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-12-04T09:17:10.1817021Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-12-04T09:17:10.1820230Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-12-04T09:17:10.1822189Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-12-04T09:17:10.1824721Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-12-04T09:17:10.1827800Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-12-04T09:17:10.1829844Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-12-04T09:17:10.1832302Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-12-04T09:17:10.1834911Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-12-04T09:17:10.1837020Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-12-04T09:17:10.1839513Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-12-04T09:17:10.1841979Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-12-04T09:17:10.1844144Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-12-04T09:17:10.1846434Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-12-04T09:17:10.1849685Z * [new branch] gh/XilunWu/171/base -> origin/gh/XilunWu/171/base 2025-12-04T09:17:10.1851606Z * [new branch] gh/XilunWu/171/head -> origin/gh/XilunWu/171/head 2025-12-04T09:17:10.1853924Z * [new branch] gh/XilunWu/171/orig -> origin/gh/XilunWu/171/orig 2025-12-04T09:17:10.1856642Z * [new branch] gh/XilunWu/173/base -> origin/gh/XilunWu/173/base 2025-12-04T09:17:10.1859004Z * [new branch] gh/XilunWu/173/head -> origin/gh/XilunWu/173/head 2025-12-04T09:17:10.1861026Z * [new branch] gh/XilunWu/173/orig -> origin/gh/XilunWu/173/orig 2025-12-04T09:17:10.1864126Z * [new branch] gh/XilunWu/175/base -> origin/gh/XilunWu/175/base 2025-12-04T09:17:10.1866590Z * [new branch] gh/XilunWu/175/head -> origin/gh/XilunWu/175/head 2025-12-04T09:17:10.1868497Z * [new branch] gh/XilunWu/175/orig -> origin/gh/XilunWu/175/orig 2025-12-04T09:17:10.1871576Z * [new branch] gh/XilunWu/176/base -> origin/gh/XilunWu/176/base 2025-12-04T09:17:10.1873649Z * [new branch] gh/XilunWu/176/head -> origin/gh/XilunWu/176/head 2025-12-04T09:17:10.1876176Z * [new branch] gh/XilunWu/176/orig -> origin/gh/XilunWu/176/orig 2025-12-04T09:17:10.1879519Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-12-04T09:17:10.1881964Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-12-04T09:17:10.1883672Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-12-04T09:17:10.1886891Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-12-04T09:17:10.1888990Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-12-04T09:17:10.1891247Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-12-04T09:17:10.1894050Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-12-04T09:17:10.1896545Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-12-04T09:17:10.1898538Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-12-04T09:17:10.1901384Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-12-04T09:17:10.1903970Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-12-04T09:17:10.1905877Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-12-04T09:17:10.1908938Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-12-04T09:17:10.1910946Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-12-04T09:17:10.1913114Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-12-04T09:17:10.1915947Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-12-04T09:17:10.1918469Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-12-04T09:17:10.1920367Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-12-04T09:17:10.1923307Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-12-04T09:17:10.1925714Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-12-04T09:17:10.1927934Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-12-04T09:17:10.1930681Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-12-04T09:17:10.1932797Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-12-04T09:17:10.1934919Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-12-04T09:17:10.1937849Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-12-04T09:17:10.1940044Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-12-04T09:17:10.1942429Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-12-04T09:17:10.1945401Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-12-04T09:17:10.1947434Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-12-04T09:17:10.1949907Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-12-04T09:17:10.1952529Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-12-04T09:17:10.1954630Z * [new branch] gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-12-04T09:17:10.1956759Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-12-04T09:17:10.1960149Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-12-04T09:17:10.1961970Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-12-04T09:17:10.1964048Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-12-04T09:17:10.1967090Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-12-04T09:17:10.1969929Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-12-04T09:17:10.1972543Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-12-04T09:17:10.1974996Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-12-04T09:17:10.1976922Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-12-04T09:17:10.1979851Z * [new branch] gh/XuehaiPan/390/base -> origin/gh/XuehaiPan/390/base 2025-12-04T09:17:10.1982157Z * [new branch] gh/XuehaiPan/390/head -> origin/gh/XuehaiPan/390/head 2025-12-04T09:17:10.1984672Z * [new branch] gh/XuehaiPan/390/orig -> origin/gh/XuehaiPan/390/orig 2025-12-04T09:17:10.1987608Z * [new branch] gh/XuehaiPan/391/base -> origin/gh/XuehaiPan/391/base 2025-12-04T09:17:10.1989504Z * [new branch] gh/XuehaiPan/391/head -> origin/gh/XuehaiPan/391/head 2025-12-04T09:17:10.1991732Z * [new branch] gh/XuehaiPan/391/orig -> origin/gh/XuehaiPan/391/orig 2025-12-04T09:17:10.1994705Z * [new branch] gh/XuehaiPan/392/base -> origin/gh/XuehaiPan/392/base 2025-12-04T09:17:10.1996778Z * [new branch] gh/XuehaiPan/392/head -> origin/gh/XuehaiPan/392/head 2025-12-04T09:17:10.1998889Z * [new branch] gh/XuehaiPan/392/orig -> origin/gh/XuehaiPan/392/orig 2025-12-04T09:17:10.2002403Z * [new branch] gh/XuehaiPan/394/base -> origin/gh/XuehaiPan/394/base 2025-12-04T09:17:10.2004478Z * [new branch] gh/XuehaiPan/394/head -> origin/gh/XuehaiPan/394/head 2025-12-04T09:17:10.2006762Z * [new branch] gh/XuehaiPan/394/orig -> origin/gh/XuehaiPan/394/orig 2025-12-04T09:17:10.2009746Z * [new branch] gh/XuehaiPan/397/base -> origin/gh/XuehaiPan/397/base 2025-12-04T09:17:10.2011830Z * [new branch] gh/XuehaiPan/397/head -> origin/gh/XuehaiPan/397/head 2025-12-04T09:17:10.2014009Z * [new branch] gh/XuehaiPan/397/orig -> origin/gh/XuehaiPan/397/orig 2025-12-04T09:17:10.2017293Z * [new branch] gh/XuehaiPan/398/base -> origin/gh/XuehaiPan/398/base 2025-12-04T09:17:10.2019120Z * [new branch] gh/XuehaiPan/398/head -> origin/gh/XuehaiPan/398/head 2025-12-04T09:17:10.2021466Z * [new branch] gh/XuehaiPan/398/orig -> origin/gh/XuehaiPan/398/orig 2025-12-04T09:17:10.2024985Z * [new branch] gh/XuehaiPan/399/base -> origin/gh/XuehaiPan/399/base 2025-12-04T09:17:10.2026987Z * [new branch] gh/XuehaiPan/399/head -> origin/gh/XuehaiPan/399/head 2025-12-04T09:17:10.2029081Z * [new branch] gh/XuehaiPan/399/orig -> origin/gh/XuehaiPan/399/orig 2025-12-04T09:17:10.2032361Z * [new branch] gh/XuehaiPan/400/base -> origin/gh/XuehaiPan/400/base 2025-12-04T09:17:10.2034292Z * [new branch] gh/XuehaiPan/400/head -> origin/gh/XuehaiPan/400/head 2025-12-04T09:17:10.2036823Z * [new branch] gh/XuehaiPan/400/orig -> origin/gh/XuehaiPan/400/orig 2025-12-04T09:17:10.2040077Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-12-04T09:17:10.2042320Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-12-04T09:17:10.2044470Z * [new branch] gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-12-04T09:17:10.2048148Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-12-04T09:17:10.2049534Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-12-04T09:17:10.2052633Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-12-04T09:17:10.2054899Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-12-04T09:17:10.2057669Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-12-04T09:17:10.2060073Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-12-04T09:17:10.2062743Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-12-04T09:17:10.2065144Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-12-04T09:17:10.2068004Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-12-04T09:17:10.2070165Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-12-04T09:17:10.2073095Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-12-04T09:17:10.2075495Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-12-04T09:17:10.2078015Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-12-04T09:17:10.2080102Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-12-04T09:17:10.2082379Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-12-04T09:17:10.2085621Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-12-04T09:17:10.2087790Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-12-04T09:17:10.2090922Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-12-04T09:17:10.2093053Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-12-04T09:17:10.2095902Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-12-04T09:17:10.2097856Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-12-04T09:17:10.2100600Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-12-04T09:17:10.2103642Z * [new branch] gh/albanD/4/base -> origin/gh/albanD/4/base 2025-12-04T09:17:10.2106192Z * [new branch] gh/albanD/4/head -> origin/gh/albanD/4/head 2025-12-04T09:17:10.2108158Z * [new branch] gh/albanD/4/orig -> origin/gh/albanD/4/orig 2025-12-04T09:17:10.2111568Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-12-04T09:17:10.2114988Z * [new branch] gh/alexsamardzic/12/base -> origin/gh/alexsamardzic/12/base 2025-12-04T09:17:10.2116953Z * [new branch] gh/alexsamardzic/12/head -> origin/gh/alexsamardzic/12/head 2025-12-04T09:17:10.2119510Z * [new branch] gh/alexsamardzic/12/orig -> origin/gh/alexsamardzic/12/orig 2025-12-04T09:17:10.2122102Z * [new branch] gh/alexsamardzic/14/base -> origin/gh/alexsamardzic/14/base 2025-12-04T09:17:10.2124574Z * [new branch] gh/alexsamardzic/14/head -> origin/gh/alexsamardzic/14/head 2025-12-04T09:17:10.2127661Z * [new branch] gh/alexsamardzic/14/orig -> origin/gh/alexsamardzic/14/orig 2025-12-04T09:17:10.2130293Z * [new branch] gh/alexsamardzic/15/base -> origin/gh/alexsamardzic/15/base 2025-12-04T09:17:10.2132482Z * [new branch] gh/alexsamardzic/15/head -> origin/gh/alexsamardzic/15/head 2025-12-04T09:17:10.2135136Z * [new branch] gh/alexsamardzic/15/orig -> origin/gh/alexsamardzic/15/orig 2025-12-04T09:17:10.2138295Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-12-04T09:17:10.2140559Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-12-04T09:17:10.2142838Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-12-04T09:17:10.2147663Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-12-04T09:17:10.2149460Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-12-04T09:17:10.2151641Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-12-04T09:17:10.2154919Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-12-04T09:17:10.2157479Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-12-04T09:17:10.2159784Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-12-04T09:17:10.2163359Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-12-04T09:17:10.2165723Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-12-04T09:17:10.2168994Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-12-04T09:17:10.2171419Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-12-04T09:17:10.2174882Z * [new branch] gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-12-04T09:17:10.2177261Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-12-04T09:17:10.2179494Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-12-04T09:17:10.2182459Z * [new branch] gh/andyanwang/42/base -> origin/gh/andyanwang/42/base 2025-12-04T09:17:10.2184866Z * [new branch] gh/andyanwang/42/head -> origin/gh/andyanwang/42/head 2025-12-04T09:17:10.2187026Z * [new branch] gh/andyanwang/42/orig -> origin/gh/andyanwang/42/orig 2025-12-04T09:17:10.2190314Z * [new branch] gh/andyanwang/45/base -> origin/gh/andyanwang/45/base 2025-12-04T09:17:10.2192255Z * [new branch] gh/andyanwang/45/head -> origin/gh/andyanwang/45/head 2025-12-04T09:17:10.2194815Z * [new branch] gh/andyanwang/45/orig -> origin/gh/andyanwang/45/orig 2025-12-04T09:17:10.2197924Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-12-04T09:17:10.2200469Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-12-04T09:17:10.2203206Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-12-04T09:17:10.2205648Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-12-04T09:17:10.2208103Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-12-04T09:17:10.2210657Z * [new branch] gh/angelayi/116/base -> origin/gh/angelayi/116/base 2025-12-04T09:17:10.2212934Z * [new branch] gh/angelayi/116/head -> origin/gh/angelayi/116/head 2025-12-04T09:17:10.2214981Z * [new branch] gh/angelayi/116/orig -> origin/gh/angelayi/116/orig 2025-12-04T09:17:10.2217928Z * [new branch] gh/angelayi/122/base -> origin/gh/angelayi/122/base 2025-12-04T09:17:10.2220345Z * [new branch] gh/angelayi/122/head -> origin/gh/angelayi/122/head 2025-12-04T09:17:10.2222232Z * [new branch] gh/angelayi/122/orig -> origin/gh/angelayi/122/orig 2025-12-04T09:17:10.2225767Z * [new branch] gh/angelayi/124/base -> origin/gh/angelayi/124/base 2025-12-04T09:17:10.2227890Z * [new branch] gh/angelayi/124/head -> origin/gh/angelayi/124/head 2025-12-04T09:17:10.2230072Z * [new branch] gh/angelayi/124/orig -> origin/gh/angelayi/124/orig 2025-12-04T09:17:10.2232839Z * [new branch] gh/angelayi/128/base -> origin/gh/angelayi/128/base 2025-12-04T09:17:10.2235109Z * [new branch] gh/angelayi/128/head -> origin/gh/angelayi/128/head 2025-12-04T09:17:10.2237481Z * [new branch] gh/angelayi/128/orig -> origin/gh/angelayi/128/orig 2025-12-04T09:17:10.2240074Z * [new branch] gh/angelayi/131/base -> origin/gh/angelayi/131/base 2025-12-04T09:17:10.2242154Z * [new branch] gh/angelayi/131/head -> origin/gh/angelayi/131/head 2025-12-04T09:17:10.2244293Z * [new branch] gh/angelayi/131/orig -> origin/gh/angelayi/131/orig 2025-12-04T09:17:10.2247444Z * [new branch] gh/angelayi/132/base -> origin/gh/angelayi/132/base 2025-12-04T09:17:10.2249689Z * [new branch] gh/angelayi/132/head -> origin/gh/angelayi/132/head 2025-12-04T09:17:10.2251900Z * [new branch] gh/angelayi/132/orig -> origin/gh/angelayi/132/orig 2025-12-04T09:17:10.2254699Z * [new branch] gh/angelayi/133/base -> origin/gh/angelayi/133/base 2025-12-04T09:17:10.2256891Z * [new branch] gh/angelayi/133/head -> origin/gh/angelayi/133/head 2025-12-04T09:17:10.2259267Z * [new branch] gh/angelayi/133/orig -> origin/gh/angelayi/133/orig 2025-12-04T09:17:10.2262241Z * [new branch] gh/angelayi/134/base -> origin/gh/angelayi/134/base 2025-12-04T09:17:10.2264837Z * [new branch] gh/angelayi/134/head -> origin/gh/angelayi/134/head 2025-12-04T09:17:10.2267015Z * [new branch] gh/angelayi/134/orig -> origin/gh/angelayi/134/orig 2025-12-04T09:17:10.2269902Z * [new branch] gh/angelayi/135/base -> origin/gh/angelayi/135/base 2025-12-04T09:17:10.2272225Z * [new branch] gh/angelayi/135/head -> origin/gh/angelayi/135/head 2025-12-04T09:17:10.2274229Z * [new branch] gh/angelayi/135/orig -> origin/gh/angelayi/135/orig 2025-12-04T09:17:10.2277450Z * [new branch] gh/angelayi/136/base -> origin/gh/angelayi/136/base 2025-12-04T09:17:10.2279284Z * [new branch] gh/angelayi/136/head -> origin/gh/angelayi/136/head 2025-12-04T09:17:10.2281382Z * [new branch] gh/angelayi/136/orig -> origin/gh/angelayi/136/orig 2025-12-04T09:17:10.2284256Z * [new branch] gh/angelayi/137/base -> origin/gh/angelayi/137/base 2025-12-04T09:17:10.2286411Z * [new branch] gh/angelayi/137/head -> origin/gh/angelayi/137/head 2025-12-04T09:17:10.2288776Z * [new branch] gh/angelayi/137/orig -> origin/gh/angelayi/137/orig 2025-12-04T09:17:10.2291536Z * [new branch] gh/angelayi/138/base -> origin/gh/angelayi/138/base 2025-12-04T09:17:10.2293759Z * [new branch] gh/angelayi/138/head -> origin/gh/angelayi/138/head 2025-12-04T09:17:10.2295723Z * [new branch] gh/angelayi/138/orig -> origin/gh/angelayi/138/orig 2025-12-04T09:17:10.2298532Z * [new branch] gh/angelayi/139/base -> origin/gh/angelayi/139/base 2025-12-04T09:17:10.2300639Z * [new branch] gh/angelayi/139/head -> origin/gh/angelayi/139/head 2025-12-04T09:17:10.2302804Z * [new branch] gh/angelayi/139/orig -> origin/gh/angelayi/139/orig 2025-12-04T09:17:10.2306230Z * [new branch] gh/angelayi/140/base -> origin/gh/angelayi/140/base 2025-12-04T09:17:10.2308138Z * [new branch] gh/angelayi/140/head -> origin/gh/angelayi/140/head 2025-12-04T09:17:10.2310339Z * [new branch] gh/angelayi/140/orig -> origin/gh/angelayi/140/orig 2025-12-04T09:17:10.2313889Z * [new branch] gh/angelayi/141/base -> origin/gh/angelayi/141/base 2025-12-04T09:17:10.2316358Z * [new branch] gh/angelayi/141/head -> origin/gh/angelayi/141/head 2025-12-04T09:17:10.2318333Z * [new branch] gh/angelayi/141/orig -> origin/gh/angelayi/141/orig 2025-12-04T09:17:10.2321631Z * [new branch] gh/angelayi/142/base -> origin/gh/angelayi/142/base 2025-12-04T09:17:10.2323530Z * [new branch] gh/angelayi/142/head -> origin/gh/angelayi/142/head 2025-12-04T09:17:10.2326281Z * [new branch] gh/angelayi/142/orig -> origin/gh/angelayi/142/orig 2025-12-04T09:17:10.2328895Z * [new branch] gh/angelayi/143/base -> origin/gh/angelayi/143/base 2025-12-04T09:17:10.2331025Z * [new branch] gh/angelayi/143/head -> origin/gh/angelayi/143/head 2025-12-04T09:17:10.2333284Z * [new branch] gh/angelayi/143/orig -> origin/gh/angelayi/143/orig 2025-12-04T09:17:10.2336156Z * [new branch] gh/angelayi/144/base -> origin/gh/angelayi/144/base 2025-12-04T09:17:10.2338361Z * [new branch] gh/angelayi/144/head -> origin/gh/angelayi/144/head 2025-12-04T09:17:10.2340517Z * [new branch] gh/angelayi/144/orig -> origin/gh/angelayi/144/orig 2025-12-04T09:17:10.2344441Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-12-04T09:17:10.2346766Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-12-04T09:17:10.2348797Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-12-04T09:17:10.2352194Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-12-04T09:17:10.2354067Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-12-04T09:17:10.2356208Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-12-04T09:17:10.2359449Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 2025-12-04T09:17:10.2361677Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-12-04T09:17:10.2363759Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-12-04T09:17:10.2367188Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 2025-12-04T09:17:10.2369137Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-12-04T09:17:10.2371304Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-12-04T09:17:10.2374256Z * [new branch] gh/anijain2305/870/base -> origin/gh/anijain2305/870/base 2025-12-04T09:17:10.2376441Z * [new branch] gh/anijain2305/870/head -> origin/gh/anijain2305/870/head 2025-12-04T09:17:10.2378700Z * [new branch] gh/anijain2305/870/orig -> origin/gh/anijain2305/870/orig 2025-12-04T09:17:10.2381690Z * [new branch] gh/anijain2305/873/base -> origin/gh/anijain2305/873/base 2025-12-04T09:17:10.2383519Z * [new branch] gh/anijain2305/873/head -> origin/gh/anijain2305/873/head 2025-12-04T09:17:10.2395975Z * [new branch] gh/anijain2305/873/orig -> origin/gh/anijain2305/873/orig 2025-12-04T09:17:10.2396766Z * [new branch] gh/anijain2305/894/base -> origin/gh/anijain2305/894/base 2025-12-04T09:17:10.2397418Z * [new branch] gh/anijain2305/894/head -> origin/gh/anijain2305/894/head 2025-12-04T09:17:10.2397972Z * [new branch] gh/anijain2305/894/orig -> origin/gh/anijain2305/894/orig 2025-12-04T09:17:10.2398518Z * [new branch] gh/anijain2305/895/base -> origin/gh/anijain2305/895/base 2025-12-04T09:17:10.2399244Z * [new branch] gh/anijain2305/895/head -> origin/gh/anijain2305/895/head 2025-12-04T09:17:10.2400134Z * [new branch] gh/anijain2305/895/orig -> origin/gh/anijain2305/895/orig 2025-12-04T09:17:10.2402251Z * [new branch] gh/anijain2305/910/base -> origin/gh/anijain2305/910/base 2025-12-04T09:17:10.2404425Z * [new branch] gh/anijain2305/910/head -> origin/gh/anijain2305/910/head 2025-12-04T09:17:10.2406971Z * [new branch] gh/anijain2305/910/orig -> origin/gh/anijain2305/910/orig 2025-12-04T09:17:10.2409635Z * [new branch] gh/anijain2305/919/base -> origin/gh/anijain2305/919/base 2025-12-04T09:17:10.2411826Z * [new branch] gh/anijain2305/919/head -> origin/gh/anijain2305/919/head 2025-12-04T09:17:10.2413937Z * [new branch] gh/anijain2305/919/orig -> origin/gh/anijain2305/919/orig 2025-12-04T09:17:10.2416796Z * [new branch] gh/anijain2305/922/base -> origin/gh/anijain2305/922/base 2025-12-04T09:17:10.2419324Z * [new branch] gh/anijain2305/922/head -> origin/gh/anijain2305/922/head 2025-12-04T09:17:10.2421248Z * [new branch] gh/anijain2305/922/orig -> origin/gh/anijain2305/922/orig 2025-12-04T09:17:10.2424983Z * [new branch] gh/anijain2305/932/base -> origin/gh/anijain2305/932/base 2025-12-04T09:17:10.2427010Z * [new branch] gh/anijain2305/932/head -> origin/gh/anijain2305/932/head 2025-12-04T09:17:10.2429403Z * [new branch] gh/anijain2305/932/orig -> origin/gh/anijain2305/932/orig 2025-12-04T09:17:10.2432504Z * [new branch] gh/anijain2305/940/base -> origin/gh/anijain2305/940/base 2025-12-04T09:17:10.2434478Z * [new branch] gh/anijain2305/940/head -> origin/gh/anijain2305/940/head 2025-12-04T09:17:10.2436636Z * [new branch] gh/anijain2305/940/orig -> origin/gh/anijain2305/940/orig 2025-12-04T09:17:10.2439737Z * [new branch] gh/anijain2305/941/base -> origin/gh/anijain2305/941/base 2025-12-04T09:17:10.2441908Z * [new branch] gh/anijain2305/941/head -> origin/gh/anijain2305/941/head 2025-12-04T09:17:10.2444019Z * [new branch] gh/anijain2305/941/orig -> origin/gh/anijain2305/941/orig 2025-12-04T09:17:10.2446831Z * [new branch] gh/anijain2305/942/base -> origin/gh/anijain2305/942/base 2025-12-04T09:17:10.2449097Z * [new branch] gh/anijain2305/942/head -> origin/gh/anijain2305/942/head 2025-12-04T09:17:10.2451329Z * [new branch] gh/anijain2305/942/orig -> origin/gh/anijain2305/942/orig 2025-12-04T09:17:10.2454416Z * [new branch] gh/anijain2305/943/base -> origin/gh/anijain2305/943/base 2025-12-04T09:17:10.2456835Z * [new branch] gh/anijain2305/943/head -> origin/gh/anijain2305/943/head 2025-12-04T09:17:10.2458645Z * [new branch] gh/anijain2305/943/orig -> origin/gh/anijain2305/943/orig 2025-12-04T09:17:10.2462456Z * [new branch] gh/anijain2305/944/base -> origin/gh/anijain2305/944/base 2025-12-04T09:17:10.2464611Z * [new branch] gh/anijain2305/944/head -> origin/gh/anijain2305/944/head 2025-12-04T09:17:10.2467602Z * [new branch] gh/anijain2305/944/orig -> origin/gh/anijain2305/944/orig 2025-12-04T09:17:10.2470278Z * [new branch] gh/anijain2305/945/base -> origin/gh/anijain2305/945/base 2025-12-04T09:17:10.2472622Z * [new branch] gh/anijain2305/945/head -> origin/gh/anijain2305/945/head 2025-12-04T09:17:10.2475085Z * [new branch] gh/anijain2305/945/orig -> origin/gh/anijain2305/945/orig 2025-12-04T09:17:10.2478098Z * [new branch] gh/anijain2305/946/base -> origin/gh/anijain2305/946/base 2025-12-04T09:17:10.2480376Z * [new branch] gh/anijain2305/946/head -> origin/gh/anijain2305/946/head 2025-12-04T09:17:10.2483009Z * [new branch] gh/anijain2305/946/orig -> origin/gh/anijain2305/946/orig 2025-12-04T09:17:10.2485544Z * [new branch] gh/anijain2305/947/base -> origin/gh/anijain2305/947/base 2025-12-04T09:17:10.2487882Z * [new branch] gh/anijain2305/947/head -> origin/gh/anijain2305/947/head 2025-12-04T09:17:10.2489736Z * [new branch] gh/anijain2305/947/orig -> origin/gh/anijain2305/947/orig 2025-12-04T09:17:10.2492963Z * [new branch] gh/anijain2305/948/base -> origin/gh/anijain2305/948/base 2025-12-04T09:17:10.2495072Z * [new branch] gh/anijain2305/948/head -> origin/gh/anijain2305/948/head 2025-12-04T09:17:10.2497381Z * [new branch] gh/anijain2305/948/orig -> origin/gh/anijain2305/948/orig 2025-12-04T09:17:10.2500303Z * [new branch] gh/anijain2305/949/base -> origin/gh/anijain2305/949/base 2025-12-04T09:17:10.2502646Z * [new branch] gh/anijain2305/949/head -> origin/gh/anijain2305/949/head 2025-12-04T09:17:10.2505247Z * [new branch] gh/anijain2305/949/orig -> origin/gh/anijain2305/949/orig 2025-12-04T09:17:10.2507948Z * [new branch] gh/anijain2305/950/base -> origin/gh/anijain2305/950/base 2025-12-04T09:17:10.2510464Z * [new branch] gh/anijain2305/950/head -> origin/gh/anijain2305/950/head 2025-12-04T09:17:10.2512354Z * [new branch] gh/anijain2305/950/orig -> origin/gh/anijain2305/950/orig 2025-12-04T09:17:10.2515493Z * [new branch] gh/anijain2305/951/base -> origin/gh/anijain2305/951/base 2025-12-04T09:17:10.2517935Z * [new branch] gh/anijain2305/951/head -> origin/gh/anijain2305/951/head 2025-12-04T09:17:10.2520028Z * [new branch] gh/anijain2305/951/orig -> origin/gh/anijain2305/951/orig 2025-12-04T09:17:10.2523061Z * [new branch] gh/anijain2305/952/base -> origin/gh/anijain2305/952/base 2025-12-04T09:17:10.2525590Z * [new branch] gh/anijain2305/952/head -> origin/gh/anijain2305/952/head 2025-12-04T09:17:10.2528056Z * [new branch] gh/anijain2305/952/orig -> origin/gh/anijain2305/952/orig 2025-12-04T09:17:10.2531162Z * [new branch] gh/anijain2305/953/base -> origin/gh/anijain2305/953/base 2025-12-04T09:17:10.2533445Z * [new branch] gh/anijain2305/953/head -> origin/gh/anijain2305/953/head 2025-12-04T09:17:10.2535624Z * [new branch] gh/anijain2305/953/orig -> origin/gh/anijain2305/953/orig 2025-12-04T09:17:10.2538659Z * [new branch] gh/anijain2305/954/base -> origin/gh/anijain2305/954/base 2025-12-04T09:17:10.2541087Z * [new branch] gh/anijain2305/954/head -> origin/gh/anijain2305/954/head 2025-12-04T09:17:10.2543682Z * [new branch] gh/anijain2305/954/orig -> origin/gh/anijain2305/954/orig 2025-12-04T09:17:10.2546444Z * [new branch] gh/anijain2305/955/base -> origin/gh/anijain2305/955/base 2025-12-04T09:17:10.2548649Z * [new branch] gh/anijain2305/955/head -> origin/gh/anijain2305/955/head 2025-12-04T09:17:10.2551108Z * [new branch] gh/anijain2305/955/orig -> origin/gh/anijain2305/955/orig 2025-12-04T09:17:10.2554213Z * [new branch] gh/anijain2305/956/base -> origin/gh/anijain2305/956/base 2025-12-04T09:17:10.2556378Z * [new branch] gh/anijain2305/956/head -> origin/gh/anijain2305/956/head 2025-12-04T09:17:10.2558664Z * [new branch] gh/anijain2305/956/orig -> origin/gh/anijain2305/956/orig 2025-12-04T09:17:10.2561968Z * [new branch] gh/anijain2305/957/base -> origin/gh/anijain2305/957/base 2025-12-04T09:17:10.2564056Z * [new branch] gh/anijain2305/957/head -> origin/gh/anijain2305/957/head 2025-12-04T09:17:10.2566502Z * [new branch] gh/anijain2305/957/orig -> origin/gh/anijain2305/957/orig 2025-12-04T09:17:10.2569573Z * [new branch] gh/anijain2305/958/base -> origin/gh/anijain2305/958/base 2025-12-04T09:17:10.2571883Z * [new branch] gh/anijain2305/958/head -> origin/gh/anijain2305/958/head 2025-12-04T09:17:10.2573949Z * [new branch] gh/anijain2305/958/orig -> origin/gh/anijain2305/958/orig 2025-12-04T09:17:10.2577094Z * [new branch] gh/anijain2305/959/base -> origin/gh/anijain2305/959/base 2025-12-04T09:17:10.2579292Z * [new branch] gh/anijain2305/959/head -> origin/gh/anijain2305/959/head 2025-12-04T09:17:10.2581490Z * [new branch] gh/anijain2305/959/orig -> origin/gh/anijain2305/959/orig 2025-12-04T09:17:10.2584756Z * [new branch] gh/anijain2305/960/base -> origin/gh/anijain2305/960/base 2025-12-04T09:17:10.2587055Z * [new branch] gh/anijain2305/960/head -> origin/gh/anijain2305/960/head 2025-12-04T09:17:10.2589304Z * [new branch] gh/anijain2305/960/orig -> origin/gh/anijain2305/960/orig 2025-12-04T09:17:10.2592462Z * [new branch] gh/anijain2305/961/base -> origin/gh/anijain2305/961/base 2025-12-04T09:17:10.2594670Z * [new branch] gh/anijain2305/961/head -> origin/gh/anijain2305/961/head 2025-12-04T09:17:10.2597118Z * [new branch] gh/anijain2305/961/orig -> origin/gh/anijain2305/961/orig 2025-12-04T09:17:10.2600066Z * [new branch] gh/anijain2305/962/base -> origin/gh/anijain2305/962/base 2025-12-04T09:17:10.2602253Z * [new branch] gh/anijain2305/962/head -> origin/gh/anijain2305/962/head 2025-12-04T09:17:10.2604668Z * [new branch] gh/anijain2305/962/orig -> origin/gh/anijain2305/962/orig 2025-12-04T09:17:10.2608115Z * [new branch] gh/anijain2305/963/base -> origin/gh/anijain2305/963/base 2025-12-04T09:17:10.2610578Z * [new branch] gh/anijain2305/963/head -> origin/gh/anijain2305/963/head 2025-12-04T09:17:10.2612796Z * [new branch] gh/anijain2305/963/orig -> origin/gh/anijain2305/963/orig 2025-12-04T09:17:10.2616004Z * [new branch] gh/anijain2305/964/base -> origin/gh/anijain2305/964/base 2025-12-04T09:17:10.2618331Z * [new branch] gh/anijain2305/964/head -> origin/gh/anijain2305/964/head 2025-12-04T09:17:10.2620622Z * [new branch] gh/anijain2305/964/orig -> origin/gh/anijain2305/964/orig 2025-12-04T09:17:10.2623613Z * [new branch] gh/anijain2305/965/base -> origin/gh/anijain2305/965/base 2025-12-04T09:17:10.2628951Z * [new branch] gh/anijain2305/965/head -> origin/gh/anijain2305/965/head 2025-12-04T09:17:10.2631165Z * [new branch] gh/anijain2305/965/orig -> origin/gh/anijain2305/965/orig 2025-12-04T09:17:10.2634077Z * [new branch] gh/anijain2305/966/base -> origin/gh/anijain2305/966/base 2025-12-04T09:17:10.2636175Z * [new branch] gh/anijain2305/966/head -> origin/gh/anijain2305/966/head 2025-12-04T09:17:10.2638645Z * [new branch] gh/anijain2305/966/orig -> origin/gh/anijain2305/966/orig 2025-12-04T09:17:10.2641756Z * [new branch] gh/anijain2305/967/base -> origin/gh/anijain2305/967/base 2025-12-04T09:17:10.2643843Z * [new branch] gh/anijain2305/967/head -> origin/gh/anijain2305/967/head 2025-12-04T09:17:10.2646200Z * [new branch] gh/anijain2305/967/orig -> origin/gh/anijain2305/967/orig 2025-12-04T09:17:10.2649362Z * [new branch] gh/anijain2305/968/base -> origin/gh/anijain2305/968/base 2025-12-04T09:17:10.2651734Z * [new branch] gh/anijain2305/968/head -> origin/gh/anijain2305/968/head 2025-12-04T09:17:10.2654071Z * [new branch] gh/anijain2305/968/orig -> origin/gh/anijain2305/968/orig 2025-12-04T09:17:10.2657222Z * [new branch] gh/anijain2305/969/base -> origin/gh/anijain2305/969/base 2025-12-04T09:17:10.2658770Z * [new branch] gh/anijain2305/969/head -> origin/gh/anijain2305/969/head 2025-12-04T09:17:10.2661487Z * [new branch] gh/anijain2305/969/orig -> origin/gh/anijain2305/969/orig 2025-12-04T09:17:10.2664448Z * [new branch] gh/anijain2305/970/base -> origin/gh/anijain2305/970/base 2025-12-04T09:17:10.2666755Z * [new branch] gh/anijain2305/970/head -> origin/gh/anijain2305/970/head 2025-12-04T09:17:10.2668996Z * [new branch] gh/anijain2305/970/orig -> origin/gh/anijain2305/970/orig 2025-12-04T09:17:10.2672623Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-12-04T09:17:10.2674933Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-12-04T09:17:10.2677064Z * [new branch] gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 2025-12-04T09:17:10.2680919Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-12-04T09:17:10.2683225Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-12-04T09:17:10.2685994Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-12-04T09:17:10.2688120Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-12-04T09:17:10.2690944Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-12-04T09:17:10.2693158Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-12-04T09:17:10.2695921Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-12-04T09:17:10.2698006Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-12-04T09:17:10.2700771Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-12-04T09:17:10.2702905Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-12-04T09:17:10.2706231Z * [new branch] gh/anshul-si/53/base -> origin/gh/anshul-si/53/base 2025-12-04T09:17:10.2708511Z * [new branch] gh/anshul-si/53/head -> origin/gh/anshul-si/53/head 2025-12-04T09:17:10.2711774Z * [new branch] gh/anshul-si/58/base -> origin/gh/anshul-si/58/base 2025-12-04T09:17:10.2714026Z * [new branch] gh/anshul-si/58/head -> origin/gh/anshul-si/58/head 2025-12-04T09:17:10.2716642Z * [new branch] gh/anshul-si/66/base -> origin/gh/anshul-si/66/base 2025-12-04T09:17:10.2718934Z * [new branch] gh/anshul-si/66/head -> origin/gh/anshul-si/66/head 2025-12-04T09:17:10.2721137Z * [new branch] gh/anshul-si/66/orig -> origin/gh/anshul-si/66/orig 2025-12-04T09:17:10.2723923Z * [new branch] gh/anshul-si/67/base -> origin/gh/anshul-si/67/base 2025-12-04T09:17:10.2726526Z * [new branch] gh/anshul-si/67/head -> origin/gh/anshul-si/67/head 2025-12-04T09:17:10.2728646Z * [new branch] gh/anshul-si/67/orig -> origin/gh/anshul-si/67/orig 2025-12-04T09:17:10.2731773Z * [new branch] gh/anshul-si/68/base -> origin/gh/anshul-si/68/base 2025-12-04T09:17:10.2734170Z * [new branch] gh/anshul-si/68/head -> origin/gh/anshul-si/68/head 2025-12-04T09:17:10.2736247Z * [new branch] gh/anshul-si/68/orig -> origin/gh/anshul-si/68/orig 2025-12-04T09:17:10.2739617Z * [new branch] gh/anshul-si/69/base -> origin/gh/anshul-si/69/base 2025-12-04T09:17:10.2741667Z * [new branch] gh/anshul-si/69/head -> origin/gh/anshul-si/69/head 2025-12-04T09:17:10.2744036Z * [new branch] gh/anshul-si/69/orig -> origin/gh/anshul-si/69/orig 2025-12-04T09:17:10.2746951Z * [new branch] gh/anshul-si/70/base -> origin/gh/anshul-si/70/base 2025-12-04T09:17:10.2749636Z * [new branch] gh/anshul-si/70/head -> origin/gh/anshul-si/70/head 2025-12-04T09:17:10.2752106Z * [new branch] gh/anshul-si/70/orig -> origin/gh/anshul-si/70/orig 2025-12-04T09:17:10.2754840Z * [new branch] gh/anshul-si/71/base -> origin/gh/anshul-si/71/base 2025-12-04T09:17:10.2757113Z * [new branch] gh/anshul-si/71/head -> origin/gh/anshul-si/71/head 2025-12-04T09:17:10.2759256Z * [new branch] gh/anshul-si/71/orig -> origin/gh/anshul-si/71/orig 2025-12-04T09:17:10.2762585Z * [new branch] gh/anshul-si/72/base -> origin/gh/anshul-si/72/base 2025-12-04T09:17:10.2764731Z * [new branch] gh/anshul-si/72/head -> origin/gh/anshul-si/72/head 2025-12-04T09:17:10.2767081Z * [new branch] gh/anshul-si/72/orig -> origin/gh/anshul-si/72/orig 2025-12-04T09:17:10.2770297Z * [new branch] gh/anshul-si/73/base -> origin/gh/anshul-si/73/base 2025-12-04T09:17:10.2772670Z * [new branch] gh/anshul-si/73/head -> origin/gh/anshul-si/73/head 2025-12-04T09:17:10.2774974Z * [new branch] gh/anshul-si/73/orig -> origin/gh/anshul-si/73/orig 2025-12-04T09:17:10.2778448Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-12-04T09:17:10.2780598Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-12-04T09:17:10.2783837Z * [new branch] gh/aorenste/134/base -> origin/gh/aorenste/134/base 2025-12-04T09:17:10.2786188Z * [new branch] gh/aorenste/134/head -> origin/gh/aorenste/134/head 2025-12-04T09:17:10.2788505Z * [new branch] gh/aorenste/134/orig -> origin/gh/aorenste/134/orig 2025-12-04T09:17:10.2791578Z * [new branch] gh/aorenste/139/base -> origin/gh/aorenste/139/base 2025-12-04T09:17:10.2793860Z * [new branch] gh/aorenste/139/head -> origin/gh/aorenste/139/head 2025-12-04T09:17:10.2796143Z * [new branch] gh/aorenste/139/orig -> origin/gh/aorenste/139/orig 2025-12-04T09:17:10.2799319Z * [new branch] gh/aorenste/141/base -> origin/gh/aorenste/141/base 2025-12-04T09:17:10.2801301Z * [new branch] gh/aorenste/141/head -> origin/gh/aorenste/141/head 2025-12-04T09:17:10.2804533Z * [new branch] gh/aorenste/145/base -> origin/gh/aorenste/145/base 2025-12-04T09:17:10.2806786Z * [new branch] gh/aorenste/145/head -> origin/gh/aorenste/145/head 2025-12-04T09:17:10.2809163Z * [new branch] gh/aorenste/145/orig -> origin/gh/aorenste/145/orig 2025-12-04T09:17:10.2812207Z * [new branch] gh/aorenste/146/base -> origin/gh/aorenste/146/base 2025-12-04T09:17:10.2814511Z * [new branch] gh/aorenste/146/head -> origin/gh/aorenste/146/head 2025-12-04T09:17:10.2816974Z * [new branch] gh/aorenste/146/orig -> origin/gh/aorenste/146/orig 2025-12-04T09:17:10.2819899Z * [new branch] gh/aorenste/147/base -> origin/gh/aorenste/147/base 2025-12-04T09:17:10.2822183Z * [new branch] gh/aorenste/147/head -> origin/gh/aorenste/147/head 2025-12-04T09:17:10.2824815Z * [new branch] gh/aorenste/147/orig -> origin/gh/aorenste/147/orig 2025-12-04T09:17:10.2827908Z * [new branch] gh/aorenste/148/base -> origin/gh/aorenste/148/base 2025-12-04T09:17:10.2830118Z * [new branch] gh/aorenste/148/head -> origin/gh/aorenste/148/head 2025-12-04T09:17:10.2832635Z * [new branch] gh/aorenste/148/orig -> origin/gh/aorenste/148/orig 2025-12-04T09:17:10.2835398Z * [new branch] gh/aorenste/149/base -> origin/gh/aorenste/149/base 2025-12-04T09:17:10.2837625Z * [new branch] gh/aorenste/149/head -> origin/gh/aorenste/149/head 2025-12-04T09:17:10.2840061Z * [new branch] gh/aorenste/149/orig -> origin/gh/aorenste/149/orig 2025-12-04T09:17:10.2843319Z * [new branch] gh/aorenste/150/base -> origin/gh/aorenste/150/base 2025-12-04T09:17:10.2844991Z * [new branch] gh/aorenste/150/head -> origin/gh/aorenste/150/head 2025-12-04T09:17:10.2847410Z * [new branch] gh/aorenste/150/orig -> origin/gh/aorenste/150/orig 2025-12-04T09:17:10.2850307Z * [new branch] gh/aorenste/151/base -> origin/gh/aorenste/151/base 2025-12-04T09:17:10.2852496Z * [new branch] gh/aorenste/151/head -> origin/gh/aorenste/151/head 2025-12-04T09:17:10.2854802Z * [new branch] gh/aorenste/151/orig -> origin/gh/aorenste/151/orig 2025-12-04T09:17:10.2857914Z * [new branch] gh/aorenste/152/base -> origin/gh/aorenste/152/base 2025-12-04T09:17:10.2860126Z * [new branch] gh/aorenste/152/head -> origin/gh/aorenste/152/head 2025-12-04T09:17:10.2862314Z * [new branch] gh/aorenste/152/orig -> origin/gh/aorenste/152/orig 2025-12-04T09:17:10.2865253Z * [new branch] gh/aorenste/153/base -> origin/gh/aorenste/153/base 2025-12-04T09:17:10.2867617Z * [new branch] gh/aorenste/153/head -> origin/gh/aorenste/153/head 2025-12-04T09:17:10.2869701Z * [new branch] gh/aorenste/153/orig -> origin/gh/aorenste/153/orig 2025-12-04T09:17:10.2872819Z * [new branch] gh/aorenste/154/base -> origin/gh/aorenste/154/base 2025-12-04T09:17:10.2875008Z * [new branch] gh/aorenste/154/head -> origin/gh/aorenste/154/head 2025-12-04T09:17:10.2876997Z * [new branch] gh/aorenste/154/orig -> origin/gh/aorenste/154/orig 2025-12-04T09:17:10.2879978Z * [new branch] gh/aorenste/155/base -> origin/gh/aorenste/155/base 2025-12-04T09:17:10.2882247Z * [new branch] gh/aorenste/155/head -> origin/gh/aorenste/155/head 2025-12-04T09:17:10.2884470Z * [new branch] gh/aorenste/155/orig -> origin/gh/aorenste/155/orig 2025-12-04T09:17:10.2887409Z * [new branch] gh/aorenste/156/base -> origin/gh/aorenste/156/base 2025-12-04T09:17:10.2889569Z * [new branch] gh/aorenste/156/head -> origin/gh/aorenste/156/head 2025-12-04T09:17:10.2891649Z * [new branch] gh/aorenste/156/orig -> origin/gh/aorenste/156/orig 2025-12-04T09:17:10.2895200Z * [new branch] gh/aorenste/157/base -> origin/gh/aorenste/157/base 2025-12-04T09:17:10.2897501Z * [new branch] gh/aorenste/157/head -> origin/gh/aorenste/157/head 2025-12-04T09:17:10.2899536Z * [new branch] gh/aorenste/157/orig -> origin/gh/aorenste/157/orig 2025-12-04T09:17:10.2902415Z * [new branch] gh/aorenste/158/base -> origin/gh/aorenste/158/base 2025-12-04T09:17:10.2904791Z * [new branch] gh/aorenste/158/head -> origin/gh/aorenste/158/head 2025-12-04T09:17:10.2906894Z * [new branch] gh/aorenste/158/orig -> origin/gh/aorenste/158/orig 2025-12-04T09:17:10.2909764Z * [new branch] gh/aorenste/159/base -> origin/gh/aorenste/159/base 2025-12-04T09:17:10.2912023Z * [new branch] gh/aorenste/159/head -> origin/gh/aorenste/159/head 2025-12-04T09:17:10.2914182Z * [new branch] gh/aorenste/159/orig -> origin/gh/aorenste/159/orig 2025-12-04T09:17:10.2918073Z * [new branch] gh/avikchaudhuri/1/base -> origin/gh/avikchaudhuri/1/base 2025-12-04T09:17:10.2920478Z * [new branch] gh/avikchaudhuri/1/head -> origin/gh/avikchaudhuri/1/head 2025-12-04T09:17:10.2923337Z * [new branch] gh/avikchaudhuri/2/base -> origin/gh/avikchaudhuri/2/base 2025-12-04T09:17:10.2924992Z * [new branch] gh/avikchaudhuri/2/head -> origin/gh/avikchaudhuri/2/head 2025-12-04T09:17:10.2927681Z * [new branch] gh/avikchaudhuri/2/orig -> origin/gh/avikchaudhuri/2/orig 2025-12-04T09:17:10.2931582Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-12-04T09:17:10.2933592Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-12-04T09:17:10.2935877Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-12-04T09:17:10.2938862Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-12-04T09:17:10.2941037Z * [new branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-12-04T09:17:10.2943392Z * [new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-12-04T09:17:10.2946513Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-12-04T09:17:10.2948726Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-12-04T09:17:10.2950896Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-12-04T09:17:10.2954221Z * [new branch] gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-12-04T09:17:10.2956391Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-12-04T09:17:10.2958669Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-12-04T09:17:10.2961638Z * [new branch] gh/bdhirsh/672/base -> origin/gh/bdhirsh/672/base 2025-12-04T09:17:10.2963813Z * [new branch] gh/bdhirsh/672/head -> origin/gh/bdhirsh/672/head 2025-12-04T09:17:10.2966028Z * [new branch] gh/bdhirsh/672/orig -> origin/gh/bdhirsh/672/orig 2025-12-04T09:17:10.2969475Z * [new branch] gh/bdhirsh/675/base -> origin/gh/bdhirsh/675/base 2025-12-04T09:17:10.2971645Z * [new branch] gh/bdhirsh/675/head -> origin/gh/bdhirsh/675/head 2025-12-04T09:17:10.2973973Z * [new branch] gh/bdhirsh/675/orig -> origin/gh/bdhirsh/675/orig 2025-12-04T09:17:10.2976950Z * [new branch] gh/bdhirsh/676/base -> origin/gh/bdhirsh/676/base 2025-12-04T09:17:10.2979223Z * [new branch] gh/bdhirsh/676/head -> origin/gh/bdhirsh/676/head 2025-12-04T09:17:10.2981450Z * [new branch] gh/bdhirsh/676/orig -> origin/gh/bdhirsh/676/orig 2025-12-04T09:17:10.2984610Z * [new branch] gh/bdhirsh/677/base -> origin/gh/bdhirsh/677/base 2025-12-04T09:17:10.2987200Z * [new branch] gh/bdhirsh/677/head -> origin/gh/bdhirsh/677/head 2025-12-04T09:17:10.2989355Z * [new branch] gh/bdhirsh/677/orig -> origin/gh/bdhirsh/677/orig 2025-12-04T09:17:10.2992706Z * [new branch] gh/bdhirsh/678/base -> origin/gh/bdhirsh/678/base 2025-12-04T09:17:10.2995126Z * [new branch] gh/bdhirsh/678/head -> origin/gh/bdhirsh/678/head 2025-12-04T09:17:10.2997196Z * [new branch] gh/bdhirsh/678/orig -> origin/gh/bdhirsh/678/orig 2025-12-04T09:17:10.3000590Z * [new branch] gh/bdhirsh/679/base -> origin/gh/bdhirsh/679/base 2025-12-04T09:17:10.3002973Z * [new branch] gh/bdhirsh/679/head -> origin/gh/bdhirsh/679/head 2025-12-04T09:17:10.3005205Z * [new branch] gh/bdhirsh/679/orig -> origin/gh/bdhirsh/679/orig 2025-12-04T09:17:10.3008128Z * [new branch] gh/bdhirsh/680/base -> origin/gh/bdhirsh/680/base 2025-12-04T09:17:10.3010634Z * [new branch] gh/bdhirsh/680/head -> origin/gh/bdhirsh/680/head 2025-12-04T09:17:10.3012932Z * [new branch] gh/bdhirsh/680/orig -> origin/gh/bdhirsh/680/orig 2025-12-04T09:17:10.3015543Z * [new branch] gh/bdhirsh/681/base -> origin/gh/bdhirsh/681/base 2025-12-04T09:17:10.3017762Z * [new branch] gh/bdhirsh/681/head -> origin/gh/bdhirsh/681/head 2025-12-04T09:17:10.3020151Z * [new branch] gh/bdhirsh/681/orig -> origin/gh/bdhirsh/681/orig 2025-12-04T09:17:10.3023668Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-12-04T09:17:10.3028295Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-12-04T09:17:10.3030073Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-12-04T09:17:10.3033200Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-12-04T09:17:10.3035672Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-12-04T09:17:10.3037797Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-12-04T09:17:10.3040937Z * [new branch] gh/benjaminglass1/106/base -> origin/gh/benjaminglass1/106/base 2025-12-04T09:17:10.3042564Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 2025-12-04T09:17:10.3045268Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-12-04T09:17:10.3047880Z * [new branch] gh/benjaminglass1/107/base -> origin/gh/benjaminglass1/107/base 2025-12-04T09:17:10.3050407Z * [new branch] gh/benjaminglass1/107/head -> origin/gh/benjaminglass1/107/head 2025-12-04T09:17:10.3052369Z * [new branch] gh/benjaminglass1/107/orig -> origin/gh/benjaminglass1/107/orig 2025-12-04T09:17:10.3055262Z * [new branch] gh/benjaminglass1/108/base -> origin/gh/benjaminglass1/108/base 2025-12-04T09:17:10.3057651Z * [new branch] gh/benjaminglass1/108/head -> origin/gh/benjaminglass1/108/head 2025-12-04T09:17:10.3059562Z * [new branch] gh/benjaminglass1/108/orig -> origin/gh/benjaminglass1/108/orig 2025-12-04T09:17:10.3062936Z * [new branch] gh/benjaminglass1/109/base -> origin/gh/benjaminglass1/109/base 2025-12-04T09:17:10.3064944Z * [new branch] gh/benjaminglass1/109/head -> origin/gh/benjaminglass1/109/head 2025-12-04T09:17:10.3067123Z * [new branch] gh/benjaminglass1/109/orig -> origin/gh/benjaminglass1/109/orig 2025-12-04T09:17:10.3070040Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-12-04T09:17:10.3072510Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-12-04T09:17:10.3074435Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-12-04T09:17:10.3077942Z * [new branch] gh/bobrenjc93/570/base -> origin/gh/bobrenjc93/570/base 2025-12-04T09:17:10.3080387Z * [new branch] gh/bobrenjc93/570/head -> origin/gh/bobrenjc93/570/head 2025-12-04T09:17:10.3082401Z * [new branch] gh/bobrenjc93/570/orig -> origin/gh/bobrenjc93/570/orig 2025-12-04T09:17:10.3085166Z * [new branch] gh/bobrenjc93/604/base -> origin/gh/bobrenjc93/604/base 2025-12-04T09:17:10.3087719Z * [new branch] gh/bobrenjc93/604/head -> origin/gh/bobrenjc93/604/head 2025-12-04T09:17:10.3089700Z * [new branch] gh/bobrenjc93/604/orig -> origin/gh/bobrenjc93/604/orig 2025-12-04T09:17:10.3092658Z * [new branch] gh/bobrenjc93/638/base -> origin/gh/bobrenjc93/638/base 2025-12-04T09:17:10.3095136Z * [new branch] gh/bobrenjc93/638/head -> origin/gh/bobrenjc93/638/head 2025-12-04T09:17:10.3097064Z * [new branch] gh/bobrenjc93/638/orig -> origin/gh/bobrenjc93/638/orig 2025-12-04T09:17:10.3099881Z * [new branch] gh/bobrenjc93/653/base -> origin/gh/bobrenjc93/653/base 2025-12-04T09:17:10.3102168Z * [new branch] gh/bobrenjc93/653/head -> origin/gh/bobrenjc93/653/head 2025-12-04T09:17:10.3104777Z * [new branch] gh/bobrenjc93/653/orig -> origin/gh/bobrenjc93/653/orig 2025-12-04T09:17:10.3107665Z * [new branch] gh/bobrenjc93/654/base -> origin/gh/bobrenjc93/654/base 2025-12-04T09:17:10.3110289Z * [new branch] gh/bobrenjc93/654/head -> origin/gh/bobrenjc93/654/head 2025-12-04T09:17:10.3111840Z * [new branch] gh/bobrenjc93/654/orig -> origin/gh/bobrenjc93/654/orig 2025-12-04T09:17:10.3115283Z * [new branch] gh/bobrenjc93/657/base -> origin/gh/bobrenjc93/657/base 2025-12-04T09:17:10.3116935Z * [new branch] gh/bobrenjc93/657/head -> origin/gh/bobrenjc93/657/head 2025-12-04T09:17:10.3119234Z * [new branch] gh/bobrenjc93/657/orig -> origin/gh/bobrenjc93/657/orig 2025-12-04T09:17:10.3122535Z * [new branch] gh/bobrenjc93/672/base -> origin/gh/bobrenjc93/672/base 2025-12-04T09:17:10.3124057Z * [new branch] gh/bobrenjc93/672/head -> origin/gh/bobrenjc93/672/head 2025-12-04T09:17:10.3126918Z * [new branch] gh/bobrenjc93/672/orig -> origin/gh/bobrenjc93/672/orig 2025-12-04T09:17:10.3130051Z * [new branch] gh/bobrenjc93/679/base -> origin/gh/bobrenjc93/679/base 2025-12-04T09:17:10.3132059Z * [new branch] gh/bobrenjc93/679/head -> origin/gh/bobrenjc93/679/head 2025-12-04T09:17:10.3134227Z * [new branch] gh/bobrenjc93/679/orig -> origin/gh/bobrenjc93/679/orig 2025-12-04T09:17:10.3137168Z * [new branch] gh/bobrenjc93/680/base -> origin/gh/bobrenjc93/680/base 2025-12-04T09:17:10.3139575Z * [new branch] gh/bobrenjc93/680/head -> origin/gh/bobrenjc93/680/head 2025-12-04T09:17:10.3141656Z * [new branch] gh/bobrenjc93/680/orig -> origin/gh/bobrenjc93/680/orig 2025-12-04T09:17:10.3144985Z * [new branch] gh/bobrenjc93/681/base -> origin/gh/bobrenjc93/681/base 2025-12-04T09:17:10.3146531Z * [new branch] gh/bobrenjc93/681/head -> origin/gh/bobrenjc93/681/head 2025-12-04T09:17:10.3149250Z * [new branch] gh/bobrenjc93/681/orig -> origin/gh/bobrenjc93/681/orig 2025-12-04T09:17:10.3151780Z * [new branch] gh/bobrenjc93/682/base -> origin/gh/bobrenjc93/682/base 2025-12-04T09:17:10.3153932Z * [new branch] gh/bobrenjc93/682/head -> origin/gh/bobrenjc93/682/head 2025-12-04T09:17:10.3156493Z * [new branch] gh/bobrenjc93/682/orig -> origin/gh/bobrenjc93/682/orig 2025-12-04T09:17:10.3158946Z * [new branch] gh/bobrenjc93/683/base -> origin/gh/bobrenjc93/683/base 2025-12-04T09:17:10.3161454Z * [new branch] gh/bobrenjc93/683/head -> origin/gh/bobrenjc93/683/head 2025-12-04T09:17:10.3163381Z * [new branch] gh/bobrenjc93/683/orig -> origin/gh/bobrenjc93/683/orig 2025-12-04T09:17:10.3166579Z * [new branch] gh/bobrenjc93/684/base -> origin/gh/bobrenjc93/684/base 2025-12-04T09:17:10.3168695Z * [new branch] gh/bobrenjc93/684/head -> origin/gh/bobrenjc93/684/head 2025-12-04T09:17:10.3171420Z * [new branch] gh/bobrenjc93/684/orig -> origin/gh/bobrenjc93/684/orig 2025-12-04T09:17:10.3173866Z * [new branch] gh/bobrenjc93/685/base -> origin/gh/bobrenjc93/685/base 2025-12-04T09:17:10.3176292Z * [new branch] gh/bobrenjc93/685/head -> origin/gh/bobrenjc93/685/head 2025-12-04T09:17:10.3179064Z * [new branch] gh/bobrenjc93/685/orig -> origin/gh/bobrenjc93/685/orig 2025-12-04T09:17:10.3182025Z * [new branch] gh/bobrenjc93/686/base -> origin/gh/bobrenjc93/686/base 2025-12-04T09:17:10.3186922Z * [new branch] gh/bobrenjc93/686/head -> origin/gh/bobrenjc93/686/head 2025-12-04T09:17:10.3187668Z * [new branch] gh/bobrenjc93/686/orig -> origin/gh/bobrenjc93/686/orig 2025-12-04T09:17:10.3188829Z * [new branch] gh/bobrenjc93/687/base -> origin/gh/bobrenjc93/687/base 2025-12-04T09:17:10.3191737Z * [new branch] gh/bobrenjc93/687/head -> origin/gh/bobrenjc93/687/head 2025-12-04T09:17:10.3193931Z * [new branch] gh/bobrenjc93/687/orig -> origin/gh/bobrenjc93/687/orig 2025-12-04T09:17:10.3197175Z * [new branch] gh/bobrenjc93/688/base -> origin/gh/bobrenjc93/688/base 2025-12-04T09:17:10.3199316Z * [new branch] gh/bobrenjc93/688/head -> origin/gh/bobrenjc93/688/head 2025-12-04T09:17:10.3201454Z * [new branch] gh/bobrenjc93/688/orig -> origin/gh/bobrenjc93/688/orig 2025-12-04T09:17:10.3204725Z * [new branch] gh/bobrenjc93/689/base -> origin/gh/bobrenjc93/689/base 2025-12-04T09:17:10.3206636Z * [new branch] gh/bobrenjc93/689/head -> origin/gh/bobrenjc93/689/head 2025-12-04T09:17:10.3209087Z * [new branch] gh/bobrenjc93/689/orig -> origin/gh/bobrenjc93/689/orig 2025-12-04T09:17:10.3211763Z * [new branch] gh/bobrenjc93/690/base -> origin/gh/bobrenjc93/690/base 2025-12-04T09:17:10.3214094Z * [new branch] gh/bobrenjc93/690/head -> origin/gh/bobrenjc93/690/head 2025-12-04T09:17:10.3216122Z * [new branch] gh/bobrenjc93/690/orig -> origin/gh/bobrenjc93/690/orig 2025-12-04T09:17:10.3220077Z * [new branch] gh/bobrenjc93/691/base -> origin/gh/bobrenjc93/691/base 2025-12-04T09:17:10.3222209Z * [new branch] gh/bobrenjc93/691/head -> origin/gh/bobrenjc93/691/head 2025-12-04T09:17:10.3225352Z * [new branch] gh/bobrenjc93/691/orig -> origin/gh/bobrenjc93/691/orig 2025-12-04T09:17:10.3228909Z * [new branch] gh/bobrenjc93/692/base -> origin/gh/bobrenjc93/692/base 2025-12-04T09:17:10.3231287Z * [new branch] gh/bobrenjc93/692/head -> origin/gh/bobrenjc93/692/head 2025-12-04T09:17:10.3233179Z * [new branch] gh/bobrenjc93/692/orig -> origin/gh/bobrenjc93/692/orig 2025-12-04T09:17:10.3235987Z * [new branch] gh/bobrenjc93/693/base -> origin/gh/bobrenjc93/693/base 2025-12-04T09:17:10.3238437Z * [new branch] gh/bobrenjc93/693/head -> origin/gh/bobrenjc93/693/head 2025-12-04T09:17:10.3240402Z * [new branch] gh/bobrenjc93/693/orig -> origin/gh/bobrenjc93/693/orig 2025-12-04T09:17:10.3243428Z * [new branch] gh/bobrenjc93/694/base -> origin/gh/bobrenjc93/694/base 2025-12-04T09:17:10.3245583Z * [new branch] gh/bobrenjc93/694/head -> origin/gh/bobrenjc93/694/head 2025-12-04T09:17:10.3247719Z * [new branch] gh/bobrenjc93/694/orig -> origin/gh/bobrenjc93/694/orig 2025-12-04T09:17:10.3250906Z * [new branch] gh/bobrenjc93/695/base -> origin/gh/bobrenjc93/695/base 2025-12-04T09:17:10.3252758Z * [new branch] gh/bobrenjc93/695/head -> origin/gh/bobrenjc93/695/head 2025-12-04T09:17:10.3254954Z * [new branch] gh/bobrenjc93/695/orig -> origin/gh/bobrenjc93/695/orig 2025-12-04T09:17:10.3259229Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-12-04T09:17:10.3261116Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-12-04T09:17:10.3264394Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-12-04T09:17:10.3266401Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-12-04T09:17:10.3268671Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-12-04T09:17:10.3271323Z * [new branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-12-04T09:17:10.3273560Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-12-04T09:17:10.3275925Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-12-04T09:17:10.3278797Z * [new branch] gh/c00w/56/base -> origin/gh/c00w/56/base 2025-12-04T09:17:10.3281035Z * [new branch] gh/c00w/56/head -> origin/gh/c00w/56/head 2025-12-04T09:17:10.3283185Z * [new branch] gh/c00w/56/orig -> origin/gh/c00w/56/orig 2025-12-04T09:17:10.3285908Z * [new branch] gh/c00w/57/base -> origin/gh/c00w/57/base 2025-12-04T09:17:10.3288181Z * [new branch] gh/c00w/57/head -> origin/gh/c00w/57/head 2025-12-04T09:17:10.3290476Z * [new branch] gh/c00w/57/orig -> origin/gh/c00w/57/orig 2025-12-04T09:17:10.3293247Z * [new branch] gh/c00w/58/base -> origin/gh/c00w/58/base 2025-12-04T09:17:10.3295729Z * [new branch] gh/c00w/58/head -> origin/gh/c00w/58/head 2025-12-04T09:17:10.3297308Z * [new branch] gh/c00w/58/orig -> origin/gh/c00w/58/orig 2025-12-04T09:17:10.3301183Z * [new branch] gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-12-04T09:17:10.3303815Z * [new branch] gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-12-04T09:17:10.3318087Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-12-04T09:17:10.3318878Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-12-04T09:17:10.3319633Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-12-04T09:17:10.3320255Z * [new branch] gh/coconutruben/55/base -> origin/gh/coconutruben/55/base 2025-12-04T09:17:10.3321024Z * [new branch] gh/coconutruben/55/head -> origin/gh/coconutruben/55/head 2025-12-04T09:17:10.3321751Z * [new branch] gh/coconutruben/55/orig -> origin/gh/coconutruben/55/orig 2025-12-04T09:17:10.3322319Z * [new branch] gh/coconutruben/57/base -> origin/gh/coconutruben/57/base 2025-12-04T09:17:10.3323011Z * [new branch] gh/coconutruben/57/head -> origin/gh/coconutruben/57/head 2025-12-04T09:17:10.3325779Z * [new branch] gh/coconutruben/57/orig -> origin/gh/coconutruben/57/orig 2025-12-04T09:17:10.3329027Z * [new branch] gh/coconutruben/70/base -> origin/gh/coconutruben/70/base 2025-12-04T09:17:10.3330887Z * [new branch] gh/coconutruben/70/head -> origin/gh/coconutruben/70/head 2025-12-04T09:17:10.3333130Z * [new branch] gh/coconutruben/70/orig -> origin/gh/coconutruben/70/orig 2025-12-04T09:17:10.3335826Z * [new branch] gh/coconutruben/71/base -> origin/gh/coconutruben/71/base 2025-12-04T09:17:10.3338405Z * [new branch] gh/coconutruben/71/head -> origin/gh/coconutruben/71/head 2025-12-04T09:17:10.3340521Z * [new branch] gh/coconutruben/71/orig -> origin/gh/coconutruben/71/orig 2025-12-04T09:17:10.3343726Z * [new branch] gh/coconutruben/72/base -> origin/gh/coconutruben/72/base 2025-12-04T09:17:10.3346306Z * [new branch] gh/coconutruben/72/head -> origin/gh/coconutruben/72/head 2025-12-04T09:17:10.3347758Z * [new branch] gh/coconutruben/72/orig -> origin/gh/coconutruben/72/orig 2025-12-04T09:17:10.3350627Z * [new branch] gh/coconutruben/73/base -> origin/gh/coconutruben/73/base 2025-12-04T09:17:10.3353121Z * [new branch] gh/coconutruben/73/head -> origin/gh/coconutruben/73/head 2025-12-04T09:17:10.3355560Z * [new branch] gh/coconutruben/73/orig -> origin/gh/coconutruben/73/orig 2025-12-04T09:17:10.3358341Z * [new branch] gh/coconutruben/74/base -> origin/gh/coconutruben/74/base 2025-12-04T09:17:10.3360509Z * [new branch] gh/coconutruben/74/head -> origin/gh/coconutruben/74/head 2025-12-04T09:17:10.3362703Z * [new branch] gh/coconutruben/74/orig -> origin/gh/coconutruben/74/orig 2025-12-04T09:17:10.3365813Z * [new branch] gh/coconutruben/79/base -> origin/gh/coconutruben/79/base 2025-12-04T09:17:10.3368145Z * [new branch] gh/coconutruben/79/head -> origin/gh/coconutruben/79/head 2025-12-04T09:17:10.3370451Z * [new branch] gh/coconutruben/79/orig -> origin/gh/coconutruben/79/orig 2025-12-04T09:17:10.3373061Z * [new branch] gh/coconutruben/80/base -> origin/gh/coconutruben/80/base 2025-12-04T09:17:10.3375616Z * [new branch] gh/coconutruben/80/head -> origin/gh/coconutruben/80/head 2025-12-04T09:17:10.3377123Z * [new branch] gh/coconutruben/80/orig -> origin/gh/coconutruben/80/orig 2025-12-04T09:17:10.3380372Z * [new branch] gh/coconutruben/82/base -> origin/gh/coconutruben/82/base 2025-12-04T09:17:10.3382388Z * [new branch] gh/coconutruben/82/head -> origin/gh/coconutruben/82/head 2025-12-04T09:17:10.3384688Z * [new branch] gh/coconutruben/82/orig -> origin/gh/coconutruben/82/orig 2025-12-04T09:17:10.3387907Z * [new branch] gh/coconutruben/83/base -> origin/gh/coconutruben/83/base 2025-12-04T09:17:10.3390391Z * [new branch] gh/coconutruben/83/head -> origin/gh/coconutruben/83/head 2025-12-04T09:17:10.3392245Z * [new branch] gh/coconutruben/83/orig -> origin/gh/coconutruben/83/orig 2025-12-04T09:17:10.3395230Z * [new branch] gh/coconutruben/84/base -> origin/gh/coconutruben/84/base 2025-12-04T09:17:10.3397486Z * [new branch] gh/coconutruben/84/head -> origin/gh/coconutruben/84/head 2025-12-04T09:17:10.3400002Z * [new branch] gh/coconutruben/84/orig -> origin/gh/coconutruben/84/orig 2025-12-04T09:17:10.3402640Z * [new branch] gh/coconutruben/85/base -> origin/gh/coconutruben/85/base 2025-12-04T09:17:10.3405241Z * [new branch] gh/coconutruben/85/head -> origin/gh/coconutruben/85/head 2025-12-04T09:17:10.3407178Z * [new branch] gh/coconutruben/85/orig -> origin/gh/coconutruben/85/orig 2025-12-04T09:17:10.3410197Z * [new branch] gh/coconutruben/86/base -> origin/gh/coconutruben/86/base 2025-12-04T09:17:10.3412310Z * [new branch] gh/coconutruben/86/head -> origin/gh/coconutruben/86/head 2025-12-04T09:17:10.3414635Z * [new branch] gh/coconutruben/86/orig -> origin/gh/coconutruben/86/orig 2025-12-04T09:17:10.3418621Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base 2025-12-04T09:17:10.3420154Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head 2025-12-04T09:17:10.3423289Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base 2025-12-04T09:17:10.3425834Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head 2025-12-04T09:17:10.3430872Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base 2025-12-04T09:17:10.3433031Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head 2025-12-04T09:17:10.3435646Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base 2025-12-04T09:17:10.3437740Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head 2025-12-04T09:17:10.3441049Z * [new branch] gh/d4l3k/1/base -> origin/gh/d4l3k/1/base 2025-12-04T09:17:10.3443241Z * [new branch] gh/d4l3k/1/head -> origin/gh/d4l3k/1/head 2025-12-04T09:17:10.3446017Z * [new branch] gh/d4l3k/2/base -> origin/gh/d4l3k/2/base 2025-12-04T09:17:10.3448151Z * [new branch] gh/d4l3k/2/head -> origin/gh/d4l3k/2/head 2025-12-04T09:17:10.3450264Z * [new branch] gh/d4l3k/2/orig -> origin/gh/d4l3k/2/orig 2025-12-04T09:17:10.3453420Z * [new branch] gh/d4l3k/3/base -> origin/gh/d4l3k/3/base 2025-12-04T09:17:10.3455254Z * [new branch] gh/d4l3k/3/head -> origin/gh/d4l3k/3/head 2025-12-04T09:17:10.3457750Z * [new branch] gh/d4l3k/3/orig -> origin/gh/d4l3k/3/orig 2025-12-04T09:17:10.3460782Z * [new branch] gh/d4l3k/4/base -> origin/gh/d4l3k/4/base 2025-12-04T09:17:10.3462774Z * [new branch] gh/d4l3k/4/head -> origin/gh/d4l3k/4/head 2025-12-04T09:17:10.3465339Z * [new branch] gh/d4l3k/4/orig -> origin/gh/d4l3k/4/orig 2025-12-04T09:17:10.3467871Z * [new branch] gh/d4l3k/5/base -> origin/gh/d4l3k/5/base 2025-12-04T09:17:10.3470067Z * [new branch] gh/d4l3k/5/orig -> origin/gh/d4l3k/5/orig 2025-12-04T09:17:10.3473712Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base 2025-12-04T09:17:10.3475815Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head 2025-12-04T09:17:10.3478151Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig 2025-12-04T09:17:10.3480977Z * [new branch] gh/davidberard98/399/base -> origin/gh/davidberard98/399/base 2025-12-04T09:17:10.3483533Z * [new branch] gh/davidberard98/399/head -> origin/gh/davidberard98/399/head 2025-12-04T09:17:10.3485962Z * [new branch] gh/davidberard98/399/orig -> origin/gh/davidberard98/399/orig 2025-12-04T09:17:10.3489621Z * [new branch] gh/desertfire/605/base -> origin/gh/desertfire/605/base 2025-12-04T09:17:10.3491533Z * [new branch] gh/desertfire/605/head -> origin/gh/desertfire/605/head 2025-12-04T09:17:10.3493685Z * [new branch] gh/desertfire/605/orig -> origin/gh/desertfire/605/orig 2025-12-04T09:17:10.3496491Z * [new branch] gh/desertfire/606/base -> origin/gh/desertfire/606/base 2025-12-04T09:17:10.3498692Z * [new branch] gh/desertfire/606/head -> origin/gh/desertfire/606/head 2025-12-04T09:17:10.3500970Z * [new branch] gh/desertfire/606/orig -> origin/gh/desertfire/606/orig 2025-12-04T09:17:10.3504233Z * [new branch] gh/desertfire/607/base -> origin/gh/desertfire/607/base 2025-12-04T09:17:10.3506116Z * [new branch] gh/desertfire/607/head -> origin/gh/desertfire/607/head 2025-12-04T09:17:10.3508285Z * [new branch] gh/desertfire/607/orig -> origin/gh/desertfire/607/orig 2025-12-04T09:17:10.3511480Z * [new branch] gh/desertfire/608/base -> origin/gh/desertfire/608/base 2025-12-04T09:17:10.3513370Z * [new branch] gh/desertfire/608/head -> origin/gh/desertfire/608/head 2025-12-04T09:17:10.3515913Z * [new branch] gh/desertfire/608/orig -> origin/gh/desertfire/608/orig 2025-12-04T09:17:10.3518472Z * [new branch] gh/desertfire/609/base -> origin/gh/desertfire/609/base 2025-12-04T09:17:10.3520719Z * [new branch] gh/desertfire/609/head -> origin/gh/desertfire/609/head 2025-12-04T09:17:10.3522870Z * [new branch] gh/desertfire/609/orig -> origin/gh/desertfire/609/orig 2025-12-04T09:17:10.3527039Z * [new branch] gh/desertfire/610/base -> origin/gh/desertfire/610/base 2025-12-04T09:17:10.3529021Z * [new branch] gh/desertfire/610/head -> origin/gh/desertfire/610/head 2025-12-04T09:17:10.3531174Z * [new branch] gh/desertfire/610/orig -> origin/gh/desertfire/610/orig 2025-12-04T09:17:10.3534159Z * [new branch] gh/desertfire/611/base -> origin/gh/desertfire/611/base 2025-12-04T09:17:10.3536391Z * [new branch] gh/desertfire/611/head -> origin/gh/desertfire/611/head 2025-12-04T09:17:10.3538523Z * [new branch] gh/desertfire/611/orig -> origin/gh/desertfire/611/orig 2025-12-04T09:17:10.3541826Z * [new branch] gh/desertfire/612/base -> origin/gh/desertfire/612/base 2025-12-04T09:17:10.3544204Z * [new branch] gh/desertfire/612/head -> origin/gh/desertfire/612/head 2025-12-04T09:17:10.3546484Z * [new branch] gh/desertfire/612/orig -> origin/gh/desertfire/612/orig 2025-12-04T09:17:10.3549132Z * [new branch] gh/desertfire/613/base -> origin/gh/desertfire/613/base 2025-12-04T09:17:10.3551786Z * [new branch] gh/desertfire/613/head -> origin/gh/desertfire/613/head 2025-12-04T09:17:10.3553737Z * [new branch] gh/desertfire/613/orig -> origin/gh/desertfire/613/orig 2025-12-04T09:17:10.3556745Z * [new branch] gh/desertfire/614/base -> origin/gh/desertfire/614/base 2025-12-04T09:17:10.3559047Z * [new branch] gh/desertfire/614/head -> origin/gh/desertfire/614/head 2025-12-04T09:17:10.3561386Z * [new branch] gh/desertfire/614/orig -> origin/gh/desertfire/614/orig 2025-12-04T09:17:10.3564532Z * [new branch] gh/desertfire/615/base -> origin/gh/desertfire/615/base 2025-12-04T09:17:10.3566702Z * [new branch] gh/desertfire/615/head -> origin/gh/desertfire/615/head 2025-12-04T09:17:10.3569331Z * [new branch] gh/desertfire/615/orig -> origin/gh/desertfire/615/orig 2025-12-04T09:17:10.3571668Z * [new branch] gh/desertfire/616/base -> origin/gh/desertfire/616/base 2025-12-04T09:17:10.3573869Z * [new branch] gh/desertfire/616/head -> origin/gh/desertfire/616/head 2025-12-04T09:17:10.3576381Z * [new branch] gh/desertfire/616/orig -> origin/gh/desertfire/616/orig 2025-12-04T09:17:10.3578681Z * [new branch] gh/desertfire/617/base -> origin/gh/desertfire/617/base 2025-12-04T09:17:10.3580930Z * [new branch] gh/desertfire/617/head -> origin/gh/desertfire/617/head 2025-12-04T09:17:10.3583027Z * [new branch] gh/desertfire/617/orig -> origin/gh/desertfire/617/orig 2025-12-04T09:17:10.3586851Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base 2025-12-04T09:17:10.3588877Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head 2025-12-04T09:17:10.3592618Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base 2025-12-04T09:17:10.3594465Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head 2025-12-04T09:17:10.3596947Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig 2025-12-04T09:17:10.3599517Z * [new branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base 2025-12-04T09:17:10.3601709Z * [new branch] gh/drisspg/182/head -> origin/gh/drisspg/182/head 2025-12-04T09:17:10.3604413Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base 2025-12-04T09:17:10.3606384Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head 2025-12-04T09:17:10.3609019Z * [new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base 2025-12-04T09:17:10.3611110Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head 2025-12-04T09:17:10.3614085Z * [new branch] gh/drisspg/185/base -> origin/gh/drisspg/185/base 2025-12-04T09:17:10.3616191Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head 2025-12-04T09:17:10.3619328Z * [new branch] gh/drisspg/194/base -> origin/gh/drisspg/194/base 2025-12-04T09:17:10.3621226Z * [new branch] gh/drisspg/194/head -> origin/gh/drisspg/194/head 2025-12-04T09:17:10.3623463Z * [new branch] gh/drisspg/194/orig -> origin/gh/drisspg/194/orig 2025-12-04T09:17:10.3627013Z * [new branch] gh/drisspg/200/base -> origin/gh/drisspg/200/base 2025-12-04T09:17:10.3628875Z * [new branch] gh/drisspg/200/head -> origin/gh/drisspg/200/head 2025-12-04T09:17:10.3631595Z * [new branch] gh/drisspg/200/orig -> origin/gh/drisspg/200/orig 2025-12-04T09:17:10.3634013Z * [new branch] gh/drisspg/218/base -> origin/gh/drisspg/218/base 2025-12-04T09:17:10.3636141Z * [new branch] gh/drisspg/218/head -> origin/gh/drisspg/218/head 2025-12-04T09:17:10.3638648Z * [new branch] gh/drisspg/218/orig -> origin/gh/drisspg/218/orig 2025-12-04T09:17:10.3641325Z * [new branch] gh/drisspg/219/base -> origin/gh/drisspg/219/base 2025-12-04T09:17:10.3643458Z * [new branch] gh/drisspg/219/head -> origin/gh/drisspg/219/head 2025-12-04T09:17:10.3646014Z * [new branch] gh/drisspg/219/orig -> origin/gh/drisspg/219/orig 2025-12-04T09:17:10.3648529Z * [new branch] gh/drisspg/220/base -> origin/gh/drisspg/220/base 2025-12-04T09:17:10.3650668Z * [new branch] gh/drisspg/220/head -> origin/gh/drisspg/220/head 2025-12-04T09:17:10.3652797Z * [new branch] gh/drisspg/220/orig -> origin/gh/drisspg/220/orig 2025-12-04T09:17:10.3655675Z * [new branch] gh/drisspg/221/base -> origin/gh/drisspg/221/base 2025-12-04T09:17:10.3658140Z * [new branch] gh/drisspg/221/head -> origin/gh/drisspg/221/head 2025-12-04T09:17:10.3660036Z * [new branch] gh/drisspg/221/orig -> origin/gh/drisspg/221/orig 2025-12-04T09:17:10.3662966Z * [new branch] gh/drisspg/222/base -> origin/gh/drisspg/222/base 2025-12-04T09:17:10.3665326Z * [new branch] gh/drisspg/222/head -> origin/gh/drisspg/222/head 2025-12-04T09:17:10.3667589Z * [new branch] gh/drisspg/222/orig -> origin/gh/drisspg/222/orig 2025-12-04T09:17:10.3670578Z * [new branch] gh/drisspg/223/base -> origin/gh/drisspg/223/base 2025-12-04T09:17:10.3672976Z * [new branch] gh/drisspg/223/head -> origin/gh/drisspg/223/head 2025-12-04T09:17:10.3674925Z * [new branch] gh/drisspg/223/orig -> origin/gh/drisspg/223/orig 2025-12-04T09:17:10.3677904Z * [new branch] gh/drisspg/224/base -> origin/gh/drisspg/224/base 2025-12-04T09:17:10.3679996Z * [new branch] gh/drisspg/224/head -> origin/gh/drisspg/224/head 2025-12-04T09:17:10.3682092Z * [new branch] gh/drisspg/224/orig -> origin/gh/drisspg/224/orig 2025-12-04T09:17:10.3685102Z * [new branch] gh/drisspg/225/base -> origin/gh/drisspg/225/base 2025-12-04T09:17:10.3687230Z * [new branch] gh/drisspg/225/head -> origin/gh/drisspg/225/head 2025-12-04T09:17:10.3689338Z * [new branch] gh/drisspg/225/orig -> origin/gh/drisspg/225/orig 2025-12-04T09:17:10.3692281Z * [new branch] gh/drisspg/226/base -> origin/gh/drisspg/226/base 2025-12-04T09:17:10.3694295Z * [new branch] gh/drisspg/226/head -> origin/gh/drisspg/226/head 2025-12-04T09:17:10.3696578Z * [new branch] gh/drisspg/226/orig -> origin/gh/drisspg/226/orig 2025-12-04T09:17:10.3699936Z * [new branch] gh/drisspg/227/base -> origin/gh/drisspg/227/base 2025-12-04T09:17:10.3702053Z * [new branch] gh/drisspg/227/head -> origin/gh/drisspg/227/head 2025-12-04T09:17:10.3704765Z * [new branch] gh/drisspg/227/orig -> origin/gh/drisspg/227/orig 2025-12-04T09:17:10.3707407Z * [new branch] gh/drisspg/228/base -> origin/gh/drisspg/228/base 2025-12-04T09:17:10.3709914Z * [new branch] gh/drisspg/228/head -> origin/gh/drisspg/228/head 2025-12-04T09:17:10.3711849Z * [new branch] gh/drisspg/228/orig -> origin/gh/drisspg/228/orig 2025-12-04T09:17:10.3715157Z * [new branch] gh/drisspg/229/base -> origin/gh/drisspg/229/base 2025-12-04T09:17:10.3717051Z * [new branch] gh/drisspg/229/head -> origin/gh/drisspg/229/head 2025-12-04T09:17:10.3719733Z * [new branch] gh/drisspg/229/orig -> origin/gh/drisspg/229/orig 2025-12-04T09:17:10.3722386Z * [new branch] gh/drisspg/230/base -> origin/gh/drisspg/230/base 2025-12-04T09:17:10.3724632Z * [new branch] gh/drisspg/230/head -> origin/gh/drisspg/230/head 2025-12-04T09:17:10.3726921Z * [new branch] gh/drisspg/230/orig -> origin/gh/drisspg/230/orig 2025-12-04T09:17:10.3730478Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base 2025-12-04T09:17:10.3732684Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head 2025-12-04T09:17:10.3736142Z * [new branch] gh/dzmitry-huba/1/base -> origin/gh/dzmitry-huba/1/base 2025-12-04T09:17:10.3738377Z * [new branch] gh/dzmitry-huba/1/head -> origin/gh/dzmitry-huba/1/head 2025-12-04T09:17:10.3741480Z * [new branch] gh/dzmitry-huba/12/base -> origin/gh/dzmitry-huba/12/base 2025-12-04T09:17:10.3743826Z * [new branch] gh/dzmitry-huba/12/head -> origin/gh/dzmitry-huba/12/head 2025-12-04T09:17:10.3746067Z * [new branch] gh/dzmitry-huba/12/orig -> origin/gh/dzmitry-huba/12/orig 2025-12-04T09:17:10.3749017Z * [new branch] gh/dzmitry-huba/13/base -> origin/gh/dzmitry-huba/13/base 2025-12-04T09:17:10.3751226Z * [new branch] gh/dzmitry-huba/13/head -> origin/gh/dzmitry-huba/13/head 2025-12-04T09:17:10.3753417Z * [new branch] gh/dzmitry-huba/13/orig -> origin/gh/dzmitry-huba/13/orig 2025-12-04T09:17:10.3756309Z * [new branch] gh/dzmitry-huba/14/base -> origin/gh/dzmitry-huba/14/base 2025-12-04T09:17:10.3758513Z * [new branch] gh/dzmitry-huba/14/head -> origin/gh/dzmitry-huba/14/head 2025-12-04T09:17:10.3760674Z * [new branch] gh/dzmitry-huba/14/orig -> origin/gh/dzmitry-huba/14/orig 2025-12-04T09:17:10.3763630Z * [new branch] gh/dzmitry-huba/15/base -> origin/gh/dzmitry-huba/15/base 2025-12-04T09:17:10.3765835Z * [new branch] gh/dzmitry-huba/15/head -> origin/gh/dzmitry-huba/15/head 2025-12-04T09:17:10.3767946Z * [new branch] gh/dzmitry-huba/15/orig -> origin/gh/dzmitry-huba/15/orig 2025-12-04T09:17:10.3770930Z * [new branch] gh/dzmitry-huba/16/base -> origin/gh/dzmitry-huba/16/base 2025-12-04T09:17:10.3773199Z * [new branch] gh/dzmitry-huba/16/head -> origin/gh/dzmitry-huba/16/head 2025-12-04T09:17:10.3775362Z * [new branch] gh/dzmitry-huba/16/orig -> origin/gh/dzmitry-huba/16/orig 2025-12-04T09:17:10.3778292Z * [new branch] gh/dzmitry-huba/17/base -> origin/gh/dzmitry-huba/17/base 2025-12-04T09:17:10.3780475Z * [new branch] gh/dzmitry-huba/17/head -> origin/gh/dzmitry-huba/17/head 2025-12-04T09:17:10.3782839Z * [new branch] gh/dzmitry-huba/17/orig -> origin/gh/dzmitry-huba/17/orig 2025-12-04T09:17:10.3785707Z * [new branch] gh/dzmitry-huba/2/base -> origin/gh/dzmitry-huba/2/base 2025-12-04T09:17:10.3787853Z * [new branch] gh/dzmitry-huba/2/head -> origin/gh/dzmitry-huba/2/head 2025-12-04T09:17:10.3790606Z * [new branch] gh/dzmitry-huba/3/base -> origin/gh/dzmitry-huba/3/base 2025-12-04T09:17:10.3792672Z * [new branch] gh/dzmitry-huba/3/head -> origin/gh/dzmitry-huba/3/head 2025-12-04T09:17:10.3796115Z * [new branch] gh/eellison/808/base -> origin/gh/eellison/808/base 2025-12-04T09:17:10.3798362Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head 2025-12-04T09:17:10.3801160Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig 2025-12-04T09:17:10.3804217Z * [new branch] gh/eellison/822/base -> origin/gh/eellison/822/base 2025-12-04T09:17:10.3806559Z * [new branch] gh/eellison/822/head -> origin/gh/eellison/822/head 2025-12-04T09:17:10.3808532Z * [new branch] gh/eellison/822/orig -> origin/gh/eellison/822/orig 2025-12-04T09:17:10.3811382Z * [new branch] gh/eellison/823/base -> origin/gh/eellison/823/base 2025-12-04T09:17:10.3813672Z * [new branch] gh/eellison/823/head -> origin/gh/eellison/823/head 2025-12-04T09:17:10.3815831Z * [new branch] gh/eellison/823/orig -> origin/gh/eellison/823/orig 2025-12-04T09:17:10.3818615Z * [new branch] gh/eellison/862/base -> origin/gh/eellison/862/base 2025-12-04T09:17:10.3820786Z * [new branch] gh/eellison/862/head -> origin/gh/eellison/862/head 2025-12-04T09:17:10.3823045Z * [new branch] gh/eellison/862/orig -> origin/gh/eellison/862/orig 2025-12-04T09:17:10.3828213Z * [new branch] gh/eellison/863/base -> origin/gh/eellison/863/base 2025-12-04T09:17:10.3830351Z * [new branch] gh/eellison/863/head -> origin/gh/eellison/863/head 2025-12-04T09:17:10.3832555Z * [new branch] gh/eellison/863/orig -> origin/gh/eellison/863/orig 2025-12-04T09:17:10.3835274Z * [new branch] gh/eellison/864/base -> origin/gh/eellison/864/base 2025-12-04T09:17:10.3837517Z * [new branch] gh/eellison/864/head -> origin/gh/eellison/864/head 2025-12-04T09:17:10.3839942Z * [new branch] gh/eellison/864/orig -> origin/gh/eellison/864/orig 2025-12-04T09:17:10.3842545Z * [new branch] gh/eellison/865/base -> origin/gh/eellison/865/base 2025-12-04T09:17:10.3844787Z * [new branch] gh/eellison/865/head -> origin/gh/eellison/865/head 2025-12-04T09:17:10.3847133Z * [new branch] gh/eellison/865/orig -> origin/gh/eellison/865/orig 2025-12-04T09:17:10.3849784Z * [new branch] gh/eellison/866/base -> origin/gh/eellison/866/base 2025-12-04T09:17:10.3851945Z * [new branch] gh/eellison/866/head -> origin/gh/eellison/866/head 2025-12-04T09:17:10.3854106Z * [new branch] gh/eellison/866/orig -> origin/gh/eellison/866/orig 2025-12-04T09:17:10.3857203Z * [new branch] gh/eellison/867/base -> origin/gh/eellison/867/base 2025-12-04T09:17:10.3859234Z * [new branch] gh/eellison/867/head -> origin/gh/eellison/867/head 2025-12-04T09:17:10.3861467Z * [new branch] gh/eellison/867/orig -> origin/gh/eellison/867/orig 2025-12-04T09:17:10.3865222Z * [new branch] gh/eellison/868/base -> origin/gh/eellison/868/base 2025-12-04T09:17:10.3867693Z * [new branch] gh/eellison/868/head -> origin/gh/eellison/868/head 2025-12-04T09:17:10.3869845Z * [new branch] gh/eellison/868/orig -> origin/gh/eellison/868/orig 2025-12-04T09:17:10.3872759Z * [new branch] gh/eellison/869/base -> origin/gh/eellison/869/base 2025-12-04T09:17:10.3874930Z * [new branch] gh/eellison/869/head -> origin/gh/eellison/869/head 2025-12-04T09:17:10.3877022Z * [new branch] gh/eellison/869/orig -> origin/gh/eellison/869/orig 2025-12-04T09:17:10.3880006Z * [new branch] gh/eellison/870/base -> origin/gh/eellison/870/base 2025-12-04T09:17:10.3882382Z * [new branch] gh/eellison/870/head -> origin/gh/eellison/870/head 2025-12-04T09:17:10.3883941Z * [new branch] gh/eellison/870/orig -> origin/gh/eellison/870/orig 2025-12-04T09:17:10.3887074Z * [new branch] gh/eellison/871/base -> origin/gh/eellison/871/base 2025-12-04T09:17:10.3889112Z * [new branch] gh/eellison/871/head -> origin/gh/eellison/871/head 2025-12-04T09:17:10.3891354Z * [new branch] gh/eellison/871/orig -> origin/gh/eellison/871/orig 2025-12-04T09:17:10.3894505Z * [new branch] gh/eellison/872/base -> origin/gh/eellison/872/base 2025-12-04T09:17:10.3896448Z * [new branch] gh/eellison/872/head -> origin/gh/eellison/872/head 2025-12-04T09:17:10.3898645Z * [new branch] gh/eellison/872/orig -> origin/gh/eellison/872/orig 2025-12-04T09:17:10.3901709Z * [new branch] gh/eellison/873/base -> origin/gh/eellison/873/base 2025-12-04T09:17:10.3904047Z * [new branch] gh/eellison/873/head -> origin/gh/eellison/873/head 2025-12-04T09:17:10.3906197Z * [new branch] gh/eellison/873/orig -> origin/gh/eellison/873/orig 2025-12-04T09:17:10.3909461Z * [new branch] gh/eellison/874/base -> origin/gh/eellison/874/base 2025-12-04T09:17:10.3911668Z * [new branch] gh/eellison/874/head -> origin/gh/eellison/874/head 2025-12-04T09:17:10.3913818Z * [new branch] gh/eellison/874/orig -> origin/gh/eellison/874/orig 2025-12-04T09:17:10.3917284Z * [new branch] gh/eellison/875/base -> origin/gh/eellison/875/base 2025-12-04T09:17:10.3919496Z * [new branch] gh/eellison/875/head -> origin/gh/eellison/875/head 2025-12-04T09:17:10.3921653Z * [new branch] gh/eellison/875/orig -> origin/gh/eellison/875/orig 2025-12-04T09:17:10.3924850Z * [new branch] gh/eellison/876/base -> origin/gh/eellison/876/base 2025-12-04T09:17:10.3927071Z * [new branch] gh/eellison/876/head -> origin/gh/eellison/876/head 2025-12-04T09:17:10.3929226Z * [new branch] gh/eellison/876/orig -> origin/gh/eellison/876/orig 2025-12-04T09:17:10.3932200Z * [new branch] gh/eellison/877/base -> origin/gh/eellison/877/base 2025-12-04T09:17:10.3934423Z * [new branch] gh/eellison/877/head -> origin/gh/eellison/877/head 2025-12-04T09:17:10.3936638Z * [new branch] gh/eellison/877/orig -> origin/gh/eellison/877/orig 2025-12-04T09:17:10.3939607Z * [new branch] gh/eellison/878/base -> origin/gh/eellison/878/base 2025-12-04T09:17:10.3941667Z * [new branch] gh/eellison/878/head -> origin/gh/eellison/878/head 2025-12-04T09:17:10.3944010Z * [new branch] gh/eellison/878/orig -> origin/gh/eellison/878/orig 2025-12-04T09:17:10.3947075Z * [new branch] gh/eellison/879/base -> origin/gh/eellison/879/base 2025-12-04T09:17:10.3949205Z * [new branch] gh/eellison/879/head -> origin/gh/eellison/879/head 2025-12-04T09:17:10.3951349Z * [new branch] gh/eellison/879/orig -> origin/gh/eellison/879/orig 2025-12-04T09:17:10.3954177Z * [new branch] gh/eellison/880/base -> origin/gh/eellison/880/base 2025-12-04T09:17:10.3956387Z * [new branch] gh/eellison/880/head -> origin/gh/eellison/880/head 2025-12-04T09:17:10.3958590Z * [new branch] gh/eellison/880/orig -> origin/gh/eellison/880/orig 2025-12-04T09:17:10.3961554Z * [new branch] gh/eellison/881/base -> origin/gh/eellison/881/base 2025-12-04T09:17:10.3963843Z * [new branch] gh/eellison/881/head -> origin/gh/eellison/881/head 2025-12-04T09:17:10.3965968Z * [new branch] gh/eellison/881/orig -> origin/gh/eellison/881/orig 2025-12-04T09:17:10.3968948Z * [new branch] gh/eellison/882/base -> origin/gh/eellison/882/base 2025-12-04T09:17:10.3971125Z * [new branch] gh/eellison/882/head -> origin/gh/eellison/882/head 2025-12-04T09:17:10.3973423Z * [new branch] gh/eellison/882/orig -> origin/gh/eellison/882/orig 2025-12-04T09:17:10.3976243Z * [new branch] gh/eellison/883/base -> origin/gh/eellison/883/base 2025-12-04T09:17:10.3978350Z * [new branch] gh/eellison/883/head -> origin/gh/eellison/883/head 2025-12-04T09:17:10.3980661Z * [new branch] gh/eellison/883/orig -> origin/gh/eellison/883/orig 2025-12-04T09:17:10.3983394Z * [new branch] gh/eellison/884/base -> origin/gh/eellison/884/base 2025-12-04T09:17:10.3985604Z * [new branch] gh/eellison/884/head -> origin/gh/eellison/884/head 2025-12-04T09:17:10.3987741Z * [new branch] gh/eellison/884/orig -> origin/gh/eellison/884/orig 2025-12-04T09:17:10.3991217Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base 2025-12-04T09:17:10.3993481Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head 2025-12-04T09:17:10.3996521Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base 2025-12-04T09:17:10.3998792Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head 2025-12-04T09:17:10.4000884Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig 2025-12-04T09:17:10.4003766Z * [new branch] gh/etaf/156/base -> origin/gh/etaf/156/base 2025-12-04T09:17:10.4005964Z * [new branch] gh/etaf/156/head -> origin/gh/etaf/156/head 2025-12-04T09:17:10.4008130Z * [new branch] gh/etaf/156/orig -> origin/gh/etaf/156/orig 2025-12-04T09:17:10.4011157Z * [new branch] gh/etaf/157/base -> origin/gh/etaf/157/base 2025-12-04T09:17:10.4013327Z * [new branch] gh/etaf/157/head -> origin/gh/etaf/157/head 2025-12-04T09:17:10.4015522Z * [new branch] gh/etaf/157/orig -> origin/gh/etaf/157/orig 2025-12-04T09:17:10.4018475Z * [new branch] gh/etaf/158/base -> origin/gh/etaf/158/base 2025-12-04T09:17:10.4020798Z * [new branch] gh/etaf/158/head -> origin/gh/etaf/158/head 2025-12-04T09:17:10.4023053Z * [new branch] gh/etaf/158/orig -> origin/gh/etaf/158/orig 2025-12-04T09:17:10.4026359Z * [new branch] gh/etaf/159/base -> origin/gh/etaf/159/base 2025-12-04T09:17:10.4028542Z * [new branch] gh/etaf/159/head -> origin/gh/etaf/159/head 2025-12-04T09:17:10.4030702Z * [new branch] gh/etaf/159/orig -> origin/gh/etaf/159/orig 2025-12-04T09:17:10.4033643Z * [new branch] gh/etaf/160/base -> origin/gh/etaf/160/base 2025-12-04T09:17:10.4035855Z * [new branch] gh/etaf/160/head -> origin/gh/etaf/160/head 2025-12-04T09:17:10.4038065Z * [new branch] gh/etaf/160/orig -> origin/gh/etaf/160/orig 2025-12-04T09:17:10.4041041Z * [new branch] gh/etaf/161/base -> origin/gh/etaf/161/base 2025-12-04T09:17:10.4043321Z * [new branch] gh/etaf/161/head -> origin/gh/etaf/161/head 2025-12-04T09:17:10.4045533Z * [new branch] gh/etaf/161/orig -> origin/gh/etaf/161/orig 2025-12-04T09:17:10.4048466Z * [new branch] gh/etaf/166/base -> origin/gh/etaf/166/base 2025-12-04T09:17:10.4050833Z * [new branch] gh/etaf/166/head -> origin/gh/etaf/166/head 2025-12-04T09:17:10.4053002Z * [new branch] gh/etaf/166/orig -> origin/gh/etaf/166/orig 2025-12-04T09:17:10.4055852Z * [new branch] gh/etaf/167/base -> origin/gh/etaf/167/base 2025-12-04T09:17:10.4058044Z * [new branch] gh/etaf/167/head -> origin/gh/etaf/167/head 2025-12-04T09:17:10.4060175Z * [new branch] gh/etaf/167/orig -> origin/gh/etaf/167/orig 2025-12-04T09:17:10.4063237Z * [new branch] gh/etaf/168/base -> origin/gh/etaf/168/base 2025-12-04T09:17:10.4065550Z * [new branch] gh/etaf/168/head -> origin/gh/etaf/168/head 2025-12-04T09:17:10.4067666Z * [new branch] gh/etaf/168/orig -> origin/gh/etaf/168/orig 2025-12-04T09:17:10.4070807Z * [new branch] gh/etaf/172/base -> origin/gh/etaf/172/base 2025-12-04T09:17:10.4072836Z * [new branch] gh/etaf/172/head -> origin/gh/etaf/172/head 2025-12-04T09:17:10.4075055Z * [new branch] gh/etaf/172/orig -> origin/gh/etaf/172/orig 2025-12-04T09:17:10.4078095Z * [new branch] gh/etaf/173/base -> origin/gh/etaf/173/base 2025-12-04T09:17:10.4080524Z * [new branch] gh/etaf/173/head -> origin/gh/etaf/173/head 2025-12-04T09:17:10.4082697Z * [new branch] gh/etaf/173/orig -> origin/gh/etaf/173/orig 2025-12-04T09:17:10.4085580Z * [new branch] gh/etaf/174/base -> origin/gh/etaf/174/base 2025-12-04T09:17:10.4087722Z * [new branch] gh/etaf/174/head -> origin/gh/etaf/174/head 2025-12-04T09:17:10.4090622Z * [new branch] gh/etaf/175/base -> origin/gh/etaf/175/base 2025-12-04T09:17:10.4092794Z * [new branch] gh/etaf/175/head -> origin/gh/etaf/175/head 2025-12-04T09:17:10.4094938Z * [new branch] gh/etaf/175/orig -> origin/gh/etaf/175/orig 2025-12-04T09:17:10.4097878Z * [new branch] gh/etaf/176/base -> origin/gh/etaf/176/base 2025-12-04T09:17:10.4100159Z * [new branch] gh/etaf/176/head -> origin/gh/etaf/176/head 2025-12-04T09:17:10.4102291Z * [new branch] gh/etaf/176/orig -> origin/gh/etaf/176/orig 2025-12-04T09:17:10.4105938Z * [new branch] gh/etaf/177/base -> origin/gh/etaf/177/base 2025-12-04T09:17:10.4108182Z * [new branch] gh/etaf/177/head -> origin/gh/etaf/177/head 2025-12-04T09:17:10.4110457Z * [new branch] gh/etaf/177/orig -> origin/gh/etaf/177/orig 2025-12-04T09:17:10.4113556Z * [new branch] gh/etaf/178/base -> origin/gh/etaf/178/base 2025-12-04T09:17:10.4115875Z * [new branch] gh/etaf/178/head -> origin/gh/etaf/178/head 2025-12-04T09:17:10.4118020Z * [new branch] gh/etaf/178/orig -> origin/gh/etaf/178/orig 2025-12-04T09:17:10.4120987Z * [new branch] gh/etaf/179/base -> origin/gh/etaf/179/base 2025-12-04T09:17:10.4123146Z * [new branch] gh/etaf/179/head -> origin/gh/etaf/179/head 2025-12-04T09:17:10.4125597Z * [new branch] gh/etaf/179/orig -> origin/gh/etaf/179/orig 2025-12-04T09:17:10.4128371Z * [new branch] gh/etaf/180/base -> origin/gh/etaf/180/base 2025-12-04T09:17:10.4130548Z * [new branch] gh/etaf/180/head -> origin/gh/etaf/180/head 2025-12-04T09:17:10.4132741Z * [new branch] gh/etaf/180/orig -> origin/gh/etaf/180/orig 2025-12-04T09:17:10.4136297Z * [new branch] gh/exclamaforte/1/base -> origin/gh/exclamaforte/1/base 2025-12-04T09:17:10.4138407Z * [new branch] gh/exclamaforte/1/head -> origin/gh/exclamaforte/1/head 2025-12-04T09:17:10.4141169Z * [new branch] gh/exclamaforte/2/base -> origin/gh/exclamaforte/2/base 2025-12-04T09:17:10.4143348Z * [new branch] gh/exclamaforte/2/head -> origin/gh/exclamaforte/2/head 2025-12-04T09:17:10.4146262Z * [new branch] gh/exclamaforte/3/base -> origin/gh/exclamaforte/3/base 2025-12-04T09:17:10.4148441Z * [new branch] gh/exclamaforte/3/head -> origin/gh/exclamaforte/3/head 2025-12-04T09:17:10.4151281Z * [new branch] gh/exclamaforte/4/base -> origin/gh/exclamaforte/4/base 2025-12-04T09:17:10.4153411Z * [new branch] gh/exclamaforte/4/head -> origin/gh/exclamaforte/4/head 2025-12-04T09:17:10.4156907Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base 2025-12-04T09:17:10.4159134Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head 2025-12-04T09:17:10.4162080Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig 2025-12-04T09:17:10.4164801Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base 2025-12-04T09:17:10.4166961Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head 2025-12-04T09:17:10.4169243Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig 2025-12-04T09:17:10.4172065Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base 2025-12-04T09:17:10.4174281Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head 2025-12-04T09:17:10.4176530Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig 2025-12-04T09:17:10.4179390Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base 2025-12-04T09:17:10.4181527Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head 2025-12-04T09:17:10.4183872Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig 2025-12-04T09:17:10.4186761Z * [new branch] gh/ezyang/3139/base -> origin/gh/ezyang/3139/base 2025-12-04T09:17:10.4188829Z * [new branch] gh/ezyang/3139/head -> origin/gh/ezyang/3139/head 2025-12-04T09:17:10.4190985Z * [new branch] gh/ezyang/3139/orig -> origin/gh/ezyang/3139/orig 2025-12-04T09:17:10.4193868Z * [new branch] gh/ezyang/3140/base -> origin/gh/ezyang/3140/base 2025-12-04T09:17:10.4195986Z * [new branch] gh/ezyang/3140/head -> origin/gh/ezyang/3140/head 2025-12-04T09:17:10.4198141Z * [new branch] gh/ezyang/3140/orig -> origin/gh/ezyang/3140/orig 2025-12-04T09:17:10.4200961Z * [new branch] gh/ezyang/3143/base -> origin/gh/ezyang/3143/base 2025-12-04T09:17:10.4203131Z * [new branch] gh/ezyang/3143/head -> origin/gh/ezyang/3143/head 2025-12-04T09:17:10.4205226Z * [new branch] gh/ezyang/3143/orig -> origin/gh/ezyang/3143/orig 2025-12-04T09:17:10.4208219Z * [new branch] gh/ezyang/3144/base -> origin/gh/ezyang/3144/base 2025-12-04T09:17:10.4210443Z * [new branch] gh/ezyang/3144/head -> origin/gh/ezyang/3144/head 2025-12-04T09:17:10.4220247Z * [new branch] gh/ezyang/3144/orig -> origin/gh/ezyang/3144/orig 2025-12-04T09:17:10.4220980Z * [new branch] gh/ezyang/3167/base -> origin/gh/ezyang/3167/base 2025-12-04T09:17:10.4221601Z * [new branch] gh/ezyang/3167/head -> origin/gh/ezyang/3167/head 2025-12-04T09:17:10.4222302Z * [new branch] gh/ezyang/3167/orig -> origin/gh/ezyang/3167/orig 2025-12-04T09:17:10.4222967Z * [new branch] gh/ezyang/3173/base -> origin/gh/ezyang/3173/base 2025-12-04T09:17:10.4223942Z * [new branch] gh/ezyang/3173/head -> origin/gh/ezyang/3173/head 2025-12-04T09:17:10.4229235Z * [new branch] gh/ezyang/3173/orig -> origin/gh/ezyang/3173/orig 2025-12-04T09:17:10.4231987Z * [new branch] gh/ezyang/3175/base -> origin/gh/ezyang/3175/base 2025-12-04T09:17:10.4234080Z * [new branch] gh/ezyang/3175/head -> origin/gh/ezyang/3175/head 2025-12-04T09:17:10.4236247Z * [new branch] gh/ezyang/3175/orig -> origin/gh/ezyang/3175/orig 2025-12-04T09:17:10.4239090Z * [new branch] gh/ezyang/3182/base -> origin/gh/ezyang/3182/base 2025-12-04T09:17:10.4241184Z * [new branch] gh/ezyang/3182/head -> origin/gh/ezyang/3182/head 2025-12-04T09:17:10.4243308Z * [new branch] gh/ezyang/3182/orig -> origin/gh/ezyang/3182/orig 2025-12-04T09:17:10.4246223Z * [new branch] gh/ezyang/3185/base -> origin/gh/ezyang/3185/base 2025-12-04T09:17:10.4249066Z * [new branch] gh/ezyang/3185/head -> origin/gh/ezyang/3185/head 2025-12-04T09:17:10.4250764Z * [new branch] gh/ezyang/3185/orig -> origin/gh/ezyang/3185/orig 2025-12-04T09:17:10.4253699Z * [new branch] gh/ezyang/3189/base -> origin/gh/ezyang/3189/base 2025-12-04T09:17:10.4255901Z * [new branch] gh/ezyang/3189/head -> origin/gh/ezyang/3189/head 2025-12-04T09:17:10.4258054Z * [new branch] gh/ezyang/3189/orig -> origin/gh/ezyang/3189/orig 2025-12-04T09:17:10.4260895Z * [new branch] gh/ezyang/3191/base -> origin/gh/ezyang/3191/base 2025-12-04T09:17:10.4263114Z * [new branch] gh/ezyang/3191/head -> origin/gh/ezyang/3191/head 2025-12-04T09:17:10.4265350Z * [new branch] gh/ezyang/3191/orig -> origin/gh/ezyang/3191/orig 2025-12-04T09:17:10.4268884Z * [new branch] gh/ezyang/3192/base -> origin/gh/ezyang/3192/base 2025-12-04T09:17:10.4271020Z * [new branch] gh/ezyang/3192/head -> origin/gh/ezyang/3192/head 2025-12-04T09:17:10.4273262Z * [new branch] gh/ezyang/3192/orig -> origin/gh/ezyang/3192/orig 2025-12-04T09:17:10.4276184Z * [new branch] gh/ezyang/3193/base -> origin/gh/ezyang/3193/base 2025-12-04T09:17:10.4278347Z * [new branch] gh/ezyang/3193/head -> origin/gh/ezyang/3193/head 2025-12-04T09:17:10.4280551Z * [new branch] gh/ezyang/3193/orig -> origin/gh/ezyang/3193/orig 2025-12-04T09:17:10.4283537Z * [new branch] gh/ezyang/3194/base -> origin/gh/ezyang/3194/base 2025-12-04T09:17:10.4285683Z * [new branch] gh/ezyang/3194/head -> origin/gh/ezyang/3194/head 2025-12-04T09:17:10.4287831Z * [new branch] gh/ezyang/3194/orig -> origin/gh/ezyang/3194/orig 2025-12-04T09:17:10.4291065Z * [new branch] gh/ezyang/3195/base -> origin/gh/ezyang/3195/base 2025-12-04T09:17:10.4293210Z * [new branch] gh/ezyang/3195/head -> origin/gh/ezyang/3195/head 2025-12-04T09:17:10.4295399Z * [new branch] gh/ezyang/3195/orig -> origin/gh/ezyang/3195/orig 2025-12-04T09:17:10.4298410Z * [new branch] gh/ezyang/3196/base -> origin/gh/ezyang/3196/base 2025-12-04T09:17:10.4300536Z * [new branch] gh/ezyang/3196/head -> origin/gh/ezyang/3196/head 2025-12-04T09:17:10.4302876Z * [new branch] gh/ezyang/3196/orig -> origin/gh/ezyang/3196/orig 2025-12-04T09:17:10.4305911Z * [new branch] gh/ezyang/3197/base -> origin/gh/ezyang/3197/base 2025-12-04T09:17:10.4308037Z * [new branch] gh/ezyang/3197/head -> origin/gh/ezyang/3197/head 2025-12-04T09:17:10.4310194Z * [new branch] gh/ezyang/3197/orig -> origin/gh/ezyang/3197/orig 2025-12-04T09:17:10.4313234Z * [new branch] gh/ezyang/3198/base -> origin/gh/ezyang/3198/base 2025-12-04T09:17:10.4315438Z * [new branch] gh/ezyang/3198/head -> origin/gh/ezyang/3198/head 2025-12-04T09:17:10.4317647Z * [new branch] gh/ezyang/3198/orig -> origin/gh/ezyang/3198/orig 2025-12-04T09:17:10.4320554Z * [new branch] gh/ezyang/3199/base -> origin/gh/ezyang/3199/base 2025-12-04T09:17:10.4322667Z * [new branch] gh/ezyang/3199/head -> origin/gh/ezyang/3199/head 2025-12-04T09:17:10.4325205Z * [new branch] gh/ezyang/3199/orig -> origin/gh/ezyang/3199/orig 2025-12-04T09:17:10.4328165Z * [new branch] gh/ezyang/3200/base -> origin/gh/ezyang/3200/base 2025-12-04T09:17:10.4330296Z * [new branch] gh/ezyang/3200/head -> origin/gh/ezyang/3200/head 2025-12-04T09:17:10.4332865Z * [new branch] gh/ezyang/3200/orig -> origin/gh/ezyang/3200/orig 2025-12-04T09:17:10.4336972Z * [new branch] gh/ezyang/3201/base -> origin/gh/ezyang/3201/base 2025-12-04T09:17:10.4338168Z * [new branch] gh/ezyang/3201/head -> origin/gh/ezyang/3201/head 2025-12-04T09:17:10.4340445Z * [new branch] gh/ezyang/3201/orig -> origin/gh/ezyang/3201/orig 2025-12-04T09:17:10.4343523Z * [new branch] gh/ezyang/3202/base -> origin/gh/ezyang/3202/base 2025-12-04T09:17:10.4345713Z * [new branch] gh/ezyang/3202/head -> origin/gh/ezyang/3202/head 2025-12-04T09:17:10.4347893Z * [new branch] gh/ezyang/3202/orig -> origin/gh/ezyang/3202/orig 2025-12-04T09:17:10.4350827Z * [new branch] gh/ezyang/3203/base -> origin/gh/ezyang/3203/base 2025-12-04T09:17:10.4353414Z * [new branch] gh/ezyang/3203/head -> origin/gh/ezyang/3203/head 2025-12-04T09:17:10.4355697Z * [new branch] gh/ezyang/3203/orig -> origin/gh/ezyang/3203/orig 2025-12-04T09:17:10.4358713Z * [new branch] gh/ezyang/3204/base -> origin/gh/ezyang/3204/base 2025-12-04T09:17:10.4360860Z * [new branch] gh/ezyang/3204/head -> origin/gh/ezyang/3204/head 2025-12-04T09:17:10.4363004Z * [new branch] gh/ezyang/3204/orig -> origin/gh/ezyang/3204/orig 2025-12-04T09:17:10.4365918Z * [new branch] gh/ezyang/3205/base -> origin/gh/ezyang/3205/base 2025-12-04T09:17:10.4368082Z * [new branch] gh/ezyang/3205/head -> origin/gh/ezyang/3205/head 2025-12-04T09:17:10.4370809Z * [new branch] gh/ezyang/3205/orig -> origin/gh/ezyang/3205/orig 2025-12-04T09:17:10.4373475Z * [new branch] gh/ezyang/3206/base -> origin/gh/ezyang/3206/base 2025-12-04T09:17:10.4375684Z * [new branch] gh/ezyang/3206/head -> origin/gh/ezyang/3206/head 2025-12-04T09:17:10.4377822Z * [new branch] gh/ezyang/3206/orig -> origin/gh/ezyang/3206/orig 2025-12-04T09:17:10.4380788Z * [new branch] gh/ezyang/3207/base -> origin/gh/ezyang/3207/base 2025-12-04T09:17:10.4383030Z * [new branch] gh/ezyang/3207/head -> origin/gh/ezyang/3207/head 2025-12-04T09:17:10.4385257Z * [new branch] gh/ezyang/3207/orig -> origin/gh/ezyang/3207/orig 2025-12-04T09:17:10.4388187Z * [new branch] gh/ezyang/3208/base -> origin/gh/ezyang/3208/base 2025-12-04T09:17:10.4390366Z * [new branch] gh/ezyang/3208/head -> origin/gh/ezyang/3208/head 2025-12-04T09:17:10.4392611Z * [new branch] gh/ezyang/3208/orig -> origin/gh/ezyang/3208/orig 2025-12-04T09:17:10.4395526Z * [new branch] gh/ezyang/3209/base -> origin/gh/ezyang/3209/base 2025-12-04T09:17:10.4397714Z * [new branch] gh/ezyang/3209/head -> origin/gh/ezyang/3209/head 2025-12-04T09:17:10.4399900Z * [new branch] gh/ezyang/3209/orig -> origin/gh/ezyang/3209/orig 2025-12-04T09:17:10.4403371Z * [new branch] gh/fadara01/3/base -> origin/gh/fadara01/3/base 2025-12-04T09:17:10.4405640Z * [new branch] gh/fadara01/3/head -> origin/gh/fadara01/3/head 2025-12-04T09:17:10.4407831Z * [new branch] gh/fadara01/3/orig -> origin/gh/fadara01/3/orig 2025-12-04T09:17:10.4410744Z * [new branch] gh/fadara01/5/base -> origin/gh/fadara01/5/base 2025-12-04T09:17:10.4412894Z * [new branch] gh/fadara01/5/head -> origin/gh/fadara01/5/head 2025-12-04T09:17:10.4415044Z * [new branch] gh/fadara01/5/orig -> origin/gh/fadara01/5/orig 2025-12-04T09:17:10.4417921Z * [new branch] gh/fadara01/6/base -> origin/gh/fadara01/6/base 2025-12-04T09:17:10.4420052Z * [new branch] gh/fadara01/6/head -> origin/gh/fadara01/6/head 2025-12-04T09:17:10.4422168Z * [new branch] gh/fadara01/6/orig -> origin/gh/fadara01/6/orig 2025-12-04T09:17:10.4425441Z * [new branch] gh/fadara01/7/base -> origin/gh/fadara01/7/base 2025-12-04T09:17:10.4427541Z * [new branch] gh/fadara01/7/head -> origin/gh/fadara01/7/head 2025-12-04T09:17:10.4429735Z * [new branch] gh/fadara01/7/orig -> origin/gh/fadara01/7/orig 2025-12-04T09:17:10.4432781Z * [new branch] gh/fadara01/8/base -> origin/gh/fadara01/8/base 2025-12-04T09:17:10.4435037Z * [new branch] gh/fadara01/8/head -> origin/gh/fadara01/8/head 2025-12-04T09:17:10.4437197Z * [new branch] gh/fadara01/8/orig -> origin/gh/fadara01/8/orig 2025-12-04T09:17:10.4440040Z * [new branch] gh/fadara01/9/base -> origin/gh/fadara01/9/base 2025-12-04T09:17:10.4442254Z * [new branch] gh/fadara01/9/head -> origin/gh/fadara01/9/head 2025-12-04T09:17:10.4444482Z * [new branch] gh/fadara01/9/orig -> origin/gh/fadara01/9/orig 2025-12-04T09:17:10.4447858Z * [new branch] gh/fduwjj/182/base -> origin/gh/fduwjj/182/base 2025-12-04T09:17:10.4450498Z * [new branch] gh/fduwjj/182/head -> origin/gh/fduwjj/182/head 2025-12-04T09:17:10.4452375Z * [new branch] gh/fduwjj/182/orig -> origin/gh/fduwjj/182/orig 2025-12-04T09:17:10.4455292Z * [new branch] gh/fduwjj/211/base -> origin/gh/fduwjj/211/base 2025-12-04T09:17:10.4457453Z * [new branch] gh/fduwjj/211/head -> origin/gh/fduwjj/211/head 2025-12-04T09:17:10.4459698Z * [new branch] gh/fduwjj/211/orig -> origin/gh/fduwjj/211/orig 2025-12-04T09:17:10.4462628Z * [new branch] gh/fduwjj/212/base -> origin/gh/fduwjj/212/base 2025-12-04T09:17:10.4464845Z * [new branch] gh/fduwjj/212/head -> origin/gh/fduwjj/212/head 2025-12-04T09:17:10.4466961Z * [new branch] gh/fduwjj/212/orig -> origin/gh/fduwjj/212/orig 2025-12-04T09:17:10.4469898Z * [new branch] gh/fduwjj/213/base -> origin/gh/fduwjj/213/base 2025-12-04T09:17:10.4472076Z * [new branch] gh/fduwjj/213/head -> origin/gh/fduwjj/213/head 2025-12-04T09:17:10.4474167Z * [new branch] gh/fduwjj/213/orig -> origin/gh/fduwjj/213/orig 2025-12-04T09:17:10.4477209Z * [new branch] gh/fduwjj/226/base -> origin/gh/fduwjj/226/base 2025-12-04T09:17:10.4479344Z * [new branch] gh/fduwjj/226/head -> origin/gh/fduwjj/226/head 2025-12-04T09:17:10.4481536Z * [new branch] gh/fduwjj/226/orig -> origin/gh/fduwjj/226/orig 2025-12-04T09:17:10.4484645Z * [new branch] gh/fduwjj/229/base -> origin/gh/fduwjj/229/base 2025-12-04T09:17:10.4486814Z * [new branch] gh/fduwjj/229/head -> origin/gh/fduwjj/229/head 2025-12-04T09:17:10.4489061Z * [new branch] gh/fduwjj/229/orig -> origin/gh/fduwjj/229/orig 2025-12-04T09:17:10.4491951Z * [new branch] gh/fduwjj/233/base -> origin/gh/fduwjj/233/base 2025-12-04T09:17:10.4494090Z * [new branch] gh/fduwjj/233/head -> origin/gh/fduwjj/233/head 2025-12-04T09:17:10.4496594Z * [new branch] gh/fduwjj/233/orig -> origin/gh/fduwjj/233/orig 2025-12-04T09:17:10.4499295Z * [new branch] gh/fduwjj/234/base -> origin/gh/fduwjj/234/base 2025-12-04T09:17:10.4501408Z * [new branch] gh/fduwjj/234/head -> origin/gh/fduwjj/234/head 2025-12-04T09:17:10.4503680Z * [new branch] gh/fduwjj/234/orig -> origin/gh/fduwjj/234/orig 2025-12-04T09:17:10.4506610Z * [new branch] gh/fduwjj/235/base -> origin/gh/fduwjj/235/base 2025-12-04T09:17:10.4508831Z * [new branch] gh/fduwjj/235/head -> origin/gh/fduwjj/235/head 2025-12-04T09:17:10.4511011Z * [new branch] gh/fduwjj/235/orig -> origin/gh/fduwjj/235/orig 2025-12-04T09:17:10.4513890Z * [new branch] gh/fduwjj/236/base -> origin/gh/fduwjj/236/base 2025-12-04T09:17:10.4515981Z * [new branch] gh/fduwjj/236/head -> origin/gh/fduwjj/236/head 2025-12-04T09:17:10.4518170Z * [new branch] gh/fduwjj/236/orig -> origin/gh/fduwjj/236/orig 2025-12-04T09:17:10.4520928Z * [new branch] gh/fduwjj/237/base -> origin/gh/fduwjj/237/base 2025-12-04T09:17:10.4523067Z * [new branch] gh/fduwjj/237/head -> origin/gh/fduwjj/237/head 2025-12-04T09:17:10.4525551Z * [new branch] gh/fduwjj/237/orig -> origin/gh/fduwjj/237/orig 2025-12-04T09:17:10.4528422Z * [new branch] gh/fduwjj/238/base -> origin/gh/fduwjj/238/base 2025-12-04T09:17:10.4530664Z * [new branch] gh/fduwjj/238/head -> origin/gh/fduwjj/238/head 2025-12-04T09:17:10.4532812Z * [new branch] gh/fduwjj/238/orig -> origin/gh/fduwjj/238/orig 2025-12-04T09:17:10.4535679Z * [new branch] gh/fduwjj/239/base -> origin/gh/fduwjj/239/base 2025-12-04T09:17:10.4537897Z * [new branch] gh/fduwjj/239/head -> origin/gh/fduwjj/239/head 2025-12-04T09:17:10.4540039Z * [new branch] gh/fduwjj/239/orig -> origin/gh/fduwjj/239/orig 2025-12-04T09:17:10.4543500Z * [new branch] gh/fegin/332/base -> origin/gh/fegin/332/base 2025-12-04T09:17:10.4545822Z * [new branch] gh/fegin/332/head -> origin/gh/fegin/332/head 2025-12-04T09:17:10.4547968Z * [new branch] gh/fegin/332/orig -> origin/gh/fegin/332/orig 2025-12-04T09:17:10.4550861Z * [new branch] gh/fegin/333/base -> origin/gh/fegin/333/base 2025-12-04T09:17:10.4553036Z * [new branch] gh/fegin/333/head -> origin/gh/fegin/333/head 2025-12-04T09:17:10.4555185Z * [new branch] gh/fegin/333/orig -> origin/gh/fegin/333/orig 2025-12-04T09:17:10.4558201Z * [new branch] gh/fegin/334/base -> origin/gh/fegin/334/base 2025-12-04T09:17:10.4560393Z * [new branch] gh/fegin/334/head -> origin/gh/fegin/334/head 2025-12-04T09:17:10.4562743Z * [new branch] gh/fegin/334/orig -> origin/gh/fegin/334/orig 2025-12-04T09:17:10.4565550Z * [new branch] gh/fegin/335/base -> origin/gh/fegin/335/base 2025-12-04T09:17:10.4567695Z * [new branch] gh/fegin/335/head -> origin/gh/fegin/335/head 2025-12-04T09:17:10.4569907Z * [new branch] gh/fegin/335/orig -> origin/gh/fegin/335/orig 2025-12-04T09:17:10.4573426Z * [new branch] gh/fffrog/160/base -> origin/gh/fffrog/160/base 2025-12-04T09:17:10.4575617Z * [new branch] gh/fffrog/160/head -> origin/gh/fffrog/160/head 2025-12-04T09:17:10.4578445Z * [new branch] gh/fffrog/177/base -> origin/gh/fffrog/177/base 2025-12-04T09:17:10.4580593Z * [new branch] gh/fffrog/177/head -> origin/gh/fffrog/177/head 2025-12-04T09:17:10.4583146Z * [new branch] gh/fffrog/177/orig -> origin/gh/fffrog/177/orig 2025-12-04T09:17:10.4585870Z * [new branch] gh/fffrog/178/base -> origin/gh/fffrog/178/base 2025-12-04T09:17:10.4588054Z * [new branch] gh/fffrog/178/head -> origin/gh/fffrog/178/head 2025-12-04T09:17:10.4590254Z * [new branch] gh/fffrog/178/orig -> origin/gh/fffrog/178/orig 2025-12-04T09:17:10.4593484Z * [new branch] gh/fffrog/181/base -> origin/gh/fffrog/181/base 2025-12-04T09:17:10.4595397Z * [new branch] gh/fffrog/181/head -> origin/gh/fffrog/181/head 2025-12-04T09:17:10.4597710Z * [new branch] gh/fffrog/181/orig -> origin/gh/fffrog/181/orig 2025-12-04T09:17:10.4600646Z * [new branch] gh/fffrog/183/base -> origin/gh/fffrog/183/base 2025-12-04T09:17:10.4602684Z * [new branch] gh/fffrog/183/head -> origin/gh/fffrog/183/head 2025-12-04T09:17:10.4604860Z * [new branch] gh/fffrog/183/orig -> origin/gh/fffrog/183/orig 2025-12-04T09:17:10.4608368Z * [new branch] gh/fxdawnn/10/base -> origin/gh/fxdawnn/10/base 2025-12-04T09:17:10.4610744Z * [new branch] gh/fxdawnn/10/head -> origin/gh/fxdawnn/10/head 2025-12-04T09:17:10.4613199Z * [new branch] gh/fxdawnn/10/orig -> origin/gh/fxdawnn/10/orig 2025-12-04T09:17:10.4615853Z * [new branch] gh/fxdawnn/11/base -> origin/gh/fxdawnn/11/base 2025-12-04T09:17:10.4617934Z * [new branch] gh/fxdawnn/11/head -> origin/gh/fxdawnn/11/head 2025-12-04T09:17:10.4620167Z * [new branch] gh/fxdawnn/11/orig -> origin/gh/fxdawnn/11/orig 2025-12-04T09:17:10.4623161Z * [new branch] gh/fxdawnn/12/base -> origin/gh/fxdawnn/12/base 2025-12-04T09:17:10.4625630Z * [new branch] gh/fxdawnn/12/head -> origin/gh/fxdawnn/12/head 2025-12-04T09:17:10.4629655Z * [new branch] gh/fxdawnn/12/orig -> origin/gh/fxdawnn/12/orig 2025-12-04T09:17:10.4632621Z * [new branch] gh/fxdawnn/13/base -> origin/gh/fxdawnn/13/base 2025-12-04T09:17:10.4635352Z * [new branch] gh/fxdawnn/13/head -> origin/gh/fxdawnn/13/head 2025-12-04T09:17:10.4637296Z * [new branch] gh/fxdawnn/13/orig -> origin/gh/fxdawnn/13/orig 2025-12-04T09:17:10.4640299Z * [new branch] gh/fxdawnn/14/base -> origin/gh/fxdawnn/14/base 2025-12-04T09:17:10.4642452Z * [new branch] gh/fxdawnn/14/head -> origin/gh/fxdawnn/14/head 2025-12-04T09:17:10.4644522Z * [new branch] gh/fxdawnn/14/orig -> origin/gh/fxdawnn/14/orig 2025-12-04T09:17:10.4647466Z * [new branch] gh/fxdawnn/15/base -> origin/gh/fxdawnn/15/base 2025-12-04T09:17:10.4649586Z * [new branch] gh/fxdawnn/15/head -> origin/gh/fxdawnn/15/head 2025-12-04T09:17:10.4651763Z * [new branch] gh/fxdawnn/15/orig -> origin/gh/fxdawnn/15/orig 2025-12-04T09:17:10.4654652Z * [new branch] gh/fxdawnn/6/base -> origin/gh/fxdawnn/6/base 2025-12-04T09:17:10.4656742Z * [new branch] gh/fxdawnn/6/head -> origin/gh/fxdawnn/6/head 2025-12-04T09:17:10.4659514Z * [new branch] gh/fxdawnn/6/orig -> origin/gh/fxdawnn/6/orig 2025-12-04T09:17:10.4663070Z * [new branch] gh/fxdawnn/7/base -> origin/gh/fxdawnn/7/base 2025-12-04T09:17:10.4664904Z * [new branch] gh/fxdawnn/7/head -> origin/gh/fxdawnn/7/head 2025-12-04T09:17:10.4667023Z * [new branch] gh/fxdawnn/7/orig -> origin/gh/fxdawnn/7/orig 2025-12-04T09:17:10.4669900Z * [new branch] gh/fxdawnn/9/base -> origin/gh/fxdawnn/9/base 2025-12-04T09:17:10.4672263Z * [new branch] gh/fxdawnn/9/head -> origin/gh/fxdawnn/9/head 2025-12-04T09:17:10.4674458Z * [new branch] gh/fxdawnn/9/orig -> origin/gh/fxdawnn/9/orig 2025-12-04T09:17:10.4677908Z * [new branch] gh/galv/1/base -> origin/gh/galv/1/base 2025-12-04T09:17:10.4680363Z * [new branch] gh/galv/1/head -> origin/gh/galv/1/head 2025-12-04T09:17:10.4682448Z * [new branch] gh/galv/1/orig -> origin/gh/galv/1/orig 2025-12-04T09:17:10.4685364Z * [new branch] gh/galv/2/base -> origin/gh/galv/2/base 2025-12-04T09:17:10.4687492Z * [new branch] gh/galv/2/head -> origin/gh/galv/2/head 2025-12-04T09:17:10.4689759Z * [new branch] gh/galv/2/orig -> origin/gh/galv/2/orig 2025-12-04T09:17:10.4692857Z * [new branch] gh/galv/3/base -> origin/gh/galv/3/base 2025-12-04T09:17:10.4694873Z * [new branch] gh/galv/3/head -> origin/gh/galv/3/head 2025-12-04T09:17:10.4697210Z * [new branch] gh/galv/3/orig -> origin/gh/galv/3/orig 2025-12-04T09:17:10.4701171Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base 2025-12-04T09:17:10.4703445Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head 2025-12-04T09:17:10.4705634Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig 2025-12-04T09:17:10.4708509Z * [new branch] gh/guangyey/163/base -> origin/gh/guangyey/163/base 2025-12-04T09:17:10.4710648Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head 2025-12-04T09:17:10.4712795Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig 2025-12-04T09:17:10.4715791Z * [new branch] gh/guangyey/168/base -> origin/gh/guangyey/168/base 2025-12-04T09:17:10.4717952Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head 2025-12-04T09:17:10.4720200Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig 2025-12-04T09:17:10.4723049Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base 2025-12-04T09:17:10.4725254Z * [new branch] gh/guangyey/169/head -> origin/gh/guangyey/169/head 2025-12-04T09:17:10.4727563Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig 2025-12-04T09:17:10.4730559Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base 2025-12-04T09:17:10.4732689Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head 2025-12-04T09:17:10.4734890Z * [new branch] gh/guangyey/170/orig -> origin/gh/guangyey/170/orig 2025-12-04T09:17:10.4737715Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base 2025-12-04T09:17:10.4739842Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head 2025-12-04T09:17:10.4742015Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig 2025-12-04T09:17:10.4745067Z * [new branch] gh/guangyey/178/base -> origin/gh/guangyey/178/base 2025-12-04T09:17:10.4747263Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head 2025-12-04T09:17:10.4749481Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig 2025-12-04T09:17:10.4752371Z * [new branch] gh/guangyey/182/base -> origin/gh/guangyey/182/base 2025-12-04T09:17:10.4754542Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head 2025-12-04T09:17:10.4756688Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig 2025-12-04T09:17:10.4759536Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base 2025-12-04T09:17:10.4761649Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head 2025-12-04T09:17:10.4763904Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig 2025-12-04T09:17:10.4766791Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base 2025-12-04T09:17:10.4768952Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head 2025-12-04T09:17:10.4771112Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig 2025-12-04T09:17:10.4774029Z * [new branch] gh/guangyey/186/base -> origin/gh/guangyey/186/base 2025-12-04T09:17:10.4776252Z * [new branch] gh/guangyey/186/head -> origin/gh/guangyey/186/head 2025-12-04T09:17:10.4778595Z * [new branch] gh/guangyey/186/orig -> origin/gh/guangyey/186/orig 2025-12-04T09:17:10.4781347Z * [new branch] gh/guangyey/187/base -> origin/gh/guangyey/187/base 2025-12-04T09:17:10.4783640Z * [new branch] gh/guangyey/187/head -> origin/gh/guangyey/187/head 2025-12-04T09:17:10.4785895Z * [new branch] gh/guangyey/187/orig -> origin/gh/guangyey/187/orig 2025-12-04T09:17:10.4788768Z * [new branch] gh/guangyey/188/base -> origin/gh/guangyey/188/base 2025-12-04T09:17:10.4790994Z * [new branch] gh/guangyey/188/head -> origin/gh/guangyey/188/head 2025-12-04T09:17:10.4793195Z * [new branch] gh/guangyey/188/orig -> origin/gh/guangyey/188/orig 2025-12-04T09:17:10.4795938Z * [new branch] gh/guangyey/190/base -> origin/gh/guangyey/190/base 2025-12-04T09:17:10.4798112Z * [new branch] gh/guangyey/190/head -> origin/gh/guangyey/190/head 2025-12-04T09:17:10.4800230Z * [new branch] gh/guangyey/190/orig -> origin/gh/guangyey/190/orig 2025-12-04T09:17:10.4803116Z * [new branch] gh/guangyey/208/base -> origin/gh/guangyey/208/base 2025-12-04T09:17:10.4805195Z * [new branch] gh/guangyey/208/head -> origin/gh/guangyey/208/head 2025-12-04T09:17:10.4807494Z * [new branch] gh/guangyey/208/orig -> origin/gh/guangyey/208/orig 2025-12-04T09:17:10.4810429Z * [new branch] gh/guangyey/228/base -> origin/gh/guangyey/228/base 2025-12-04T09:17:10.4812648Z * [new branch] gh/guangyey/228/head -> origin/gh/guangyey/228/head 2025-12-04T09:17:10.4814861Z * [new branch] gh/guangyey/228/orig -> origin/gh/guangyey/228/orig 2025-12-04T09:17:10.4818493Z * [new branch] gh/guangyey/230/base -> origin/gh/guangyey/230/base 2025-12-04T09:17:10.4820406Z * [new branch] gh/guangyey/230/head -> origin/gh/guangyey/230/head 2025-12-04T09:17:10.4823190Z * [new branch] gh/guangyey/230/orig -> origin/gh/guangyey/230/orig 2025-12-04T09:17:10.4826480Z * [new branch] gh/guangyey/231/base -> origin/gh/guangyey/231/base 2025-12-04T09:17:10.4828689Z * [new branch] gh/guangyey/231/head -> origin/gh/guangyey/231/head 2025-12-04T09:17:10.4830842Z * [new branch] gh/guangyey/231/orig -> origin/gh/guangyey/231/orig 2025-12-04T09:17:10.4833719Z * [new branch] gh/guangyey/232/base -> origin/gh/guangyey/232/base 2025-12-04T09:17:10.4835889Z * [new branch] gh/guangyey/232/head -> origin/gh/guangyey/232/head 2025-12-04T09:17:10.4838153Z * [new branch] gh/guangyey/232/orig -> origin/gh/guangyey/232/orig 2025-12-04T09:17:10.4841078Z * [new branch] gh/guangyey/233/base -> origin/gh/guangyey/233/base 2025-12-04T09:17:10.4843193Z * [new branch] gh/guangyey/233/head -> origin/gh/guangyey/233/head 2025-12-04T09:17:10.4845490Z * [new branch] gh/guangyey/233/orig -> origin/gh/guangyey/233/orig 2025-12-04T09:17:10.4848279Z * [new branch] gh/guangyey/234/base -> origin/gh/guangyey/234/base 2025-12-04T09:17:10.4850467Z * [new branch] gh/guangyey/234/head -> origin/gh/guangyey/234/head 2025-12-04T09:17:10.4852599Z * [new branch] gh/guangyey/234/orig -> origin/gh/guangyey/234/orig 2025-12-04T09:17:10.4855496Z * [new branch] gh/guangyey/235/base -> origin/gh/guangyey/235/base 2025-12-04T09:17:10.4857615Z * [new branch] gh/guangyey/235/head -> origin/gh/guangyey/235/head 2025-12-04T09:17:10.4859729Z * [new branch] gh/guangyey/235/orig -> origin/gh/guangyey/235/orig 2025-12-04T09:17:10.4862775Z * [new branch] gh/guangyey/236/base -> origin/gh/guangyey/236/base 2025-12-04T09:17:10.4865266Z * [new branch] gh/guangyey/236/head -> origin/gh/guangyey/236/head 2025-12-04T09:17:10.4867416Z * [new branch] gh/guangyey/236/orig -> origin/gh/guangyey/236/orig 2025-12-04T09:17:10.4870334Z * [new branch] gh/guangyey/237/base -> origin/gh/guangyey/237/base 2025-12-04T09:17:10.4872465Z * [new branch] gh/guangyey/237/head -> origin/gh/guangyey/237/head 2025-12-04T09:17:10.4874656Z * [new branch] gh/guangyey/237/orig -> origin/gh/guangyey/237/orig 2025-12-04T09:17:10.4877603Z * [new branch] gh/guangyey/238/base -> origin/gh/guangyey/238/base 2025-12-04T09:17:10.4879810Z * [new branch] gh/guangyey/238/head -> origin/gh/guangyey/238/head 2025-12-04T09:17:10.4882625Z * [new branch] gh/guangyey/239/base -> origin/gh/guangyey/239/base 2025-12-04T09:17:10.4884840Z * [new branch] gh/guangyey/239/head -> origin/gh/guangyey/239/head 2025-12-04T09:17:10.4887025Z * [new branch] gh/guangyey/239/orig -> origin/gh/guangyey/239/orig 2025-12-04T09:17:10.4889935Z * [new branch] gh/guangyey/240/base -> origin/gh/guangyey/240/base 2025-12-04T09:17:10.4892070Z * [new branch] gh/guangyey/240/head -> origin/gh/guangyey/240/head 2025-12-04T09:17:10.4894193Z * [new branch] gh/guangyey/240/orig -> origin/gh/guangyey/240/orig 2025-12-04T09:17:10.4897300Z * [new branch] gh/guangyey/241/base -> origin/gh/guangyey/241/base 2025-12-04T09:17:10.4899411Z * [new branch] gh/guangyey/241/head -> origin/gh/guangyey/241/head 2025-12-04T09:17:10.4902084Z * [new branch] gh/guangyey/241/orig -> origin/gh/guangyey/241/orig 2025-12-04T09:17:10.4904940Z * [new branch] gh/guangyey/242/base -> origin/gh/guangyey/242/base 2025-12-04T09:17:10.4907034Z * [new branch] gh/guangyey/242/head -> origin/gh/guangyey/242/head 2025-12-04T09:17:10.4909724Z * [new branch] gh/guangyey/242/orig -> origin/gh/guangyey/242/orig 2025-12-04T09:17:10.4912320Z * [new branch] gh/guangyey/243/base -> origin/gh/guangyey/243/base 2025-12-04T09:17:10.4914512Z * [new branch] gh/guangyey/243/head -> origin/gh/guangyey/243/head 2025-12-04T09:17:10.4916681Z * [new branch] gh/guangyey/243/orig -> origin/gh/guangyey/243/orig 2025-12-04T09:17:10.4919735Z * [new branch] gh/guangyey/244/base -> origin/gh/guangyey/244/base 2025-12-04T09:17:10.4921947Z * [new branch] gh/guangyey/244/head -> origin/gh/guangyey/244/head 2025-12-04T09:17:10.4924094Z * [new branch] gh/guangyey/244/orig -> origin/gh/guangyey/244/orig 2025-12-04T09:17:10.4927438Z * [new branch] gh/guangyey/245/base -> origin/gh/guangyey/245/base 2025-12-04T09:17:10.4929554Z * [new branch] gh/guangyey/245/head -> origin/gh/guangyey/245/head 2025-12-04T09:17:10.4931774Z * [new branch] gh/guangyey/245/orig -> origin/gh/guangyey/245/orig 2025-12-04T09:17:10.4934687Z * [new branch] gh/guangyey/246/base -> origin/gh/guangyey/246/base 2025-12-04T09:17:10.4936832Z * [new branch] gh/guangyey/246/head -> origin/gh/guangyey/246/head 2025-12-04T09:17:10.4939018Z * [new branch] gh/guangyey/246/orig -> origin/gh/guangyey/246/orig 2025-12-04T09:17:10.4941943Z * [new branch] gh/guangyey/247/base -> origin/gh/guangyey/247/base 2025-12-04T09:17:10.4944259Z * [new branch] gh/guangyey/247/head -> origin/gh/guangyey/247/head 2025-12-04T09:17:10.4946457Z * [new branch] gh/guangyey/247/orig -> origin/gh/guangyey/247/orig 2025-12-04T09:17:10.4949542Z * [new branch] gh/guangyey/248/base -> origin/gh/guangyey/248/base 2025-12-04T09:17:10.4951880Z * [new branch] gh/guangyey/248/head -> origin/gh/guangyey/248/head 2025-12-04T09:17:10.4953938Z * [new branch] gh/guangyey/248/orig -> origin/gh/guangyey/248/orig 2025-12-04T09:17:10.4957013Z * [new branch] gh/guangyey/249/base -> origin/gh/guangyey/249/base 2025-12-04T09:17:10.4959149Z * [new branch] gh/guangyey/249/head -> origin/gh/guangyey/249/head 2025-12-04T09:17:10.4961338Z * [new branch] gh/guangyey/249/orig -> origin/gh/guangyey/249/orig 2025-12-04T09:17:10.4964277Z * [new branch] gh/guangyey/250/base -> origin/gh/guangyey/250/base 2025-12-04T09:17:10.4966416Z * [new branch] gh/guangyey/250/head -> origin/gh/guangyey/250/head 2025-12-04T09:17:10.4968620Z * [new branch] gh/guangyey/250/orig -> origin/gh/guangyey/250/orig 2025-12-04T09:17:10.4971465Z * [new branch] gh/guangyey/251/base -> origin/gh/guangyey/251/base 2025-12-04T09:17:10.4973715Z * [new branch] gh/guangyey/251/head -> origin/gh/guangyey/251/head 2025-12-04T09:17:10.4975862Z * [new branch] gh/guangyey/251/orig -> origin/gh/guangyey/251/orig 2025-12-04T09:17:10.4978753Z * [new branch] gh/guangyey/252/base -> origin/gh/guangyey/252/base 2025-12-04T09:17:10.4980923Z * [new branch] gh/guangyey/252/head -> origin/gh/guangyey/252/head 2025-12-04T09:17:10.4983128Z * [new branch] gh/guangyey/252/orig -> origin/gh/guangyey/252/orig 2025-12-04T09:17:10.4986170Z * [new branch] gh/guangyey/253/base -> origin/gh/guangyey/253/base 2025-12-04T09:17:10.4988299Z * [new branch] gh/guangyey/253/head -> origin/gh/guangyey/253/head 2025-12-04T09:17:10.4990929Z * [new branch] gh/guangyey/253/orig -> origin/gh/guangyey/253/orig 2025-12-04T09:17:10.4993621Z * [new branch] gh/guangyey/254/base -> origin/gh/guangyey/254/base 2025-12-04T09:17:10.4995790Z * [new branch] gh/guangyey/254/head -> origin/gh/guangyey/254/head 2025-12-04T09:17:10.4997949Z * [new branch] gh/guangyey/254/orig -> origin/gh/guangyey/254/orig 2025-12-04T09:17:10.5000873Z * [new branch] gh/guangyey/255/base -> origin/gh/guangyey/255/base 2025-12-04T09:17:10.5002960Z * [new branch] gh/guangyey/255/head -> origin/gh/guangyey/255/head 2025-12-04T09:17:10.5005126Z * [new branch] gh/guangyey/255/orig -> origin/gh/guangyey/255/orig 2025-12-04T09:17:10.5008788Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-12-04T09:17:10.5011197Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-12-04T09:17:10.5013533Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-12-04T09:17:10.5016452Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-12-04T09:17:10.5018256Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-12-04T09:17:10.5020382Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-12-04T09:17:10.5024166Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-12-04T09:17:10.5030815Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-12-04T09:17:10.5031387Z * [new branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-12-04T09:17:10.5034856Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-12-04T09:17:10.5036835Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-12-04T09:17:10.5038983Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-12-04T09:17:10.5041981Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 2025-12-04T09:17:10.5044010Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-12-04T09:17:10.5046248Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-12-04T09:17:10.5049102Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-12-04T09:17:10.5051320Z * [new branch] gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-12-04T09:17:10.5054182Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-12-04T09:17:10.5056899Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-12-04T09:17:10.5058992Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-12-04T09:17:10.5061181Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-12-04T09:17:10.5064203Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-12-04T09:17:10.5066398Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-12-04T09:17:10.5068522Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-12-04T09:17:10.5071462Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-12-04T09:17:10.5073537Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-12-04T09:17:10.5075892Z * [new branch] gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-12-04T09:17:10.5078725Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-12-04T09:17:10.5080938Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-12-04T09:17:10.5083011Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-12-04T09:17:10.5085956Z * [new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 2025-12-04T09:17:10.5088148Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-12-04T09:17:10.5090373Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-12-04T09:17:10.5093509Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 2025-12-04T09:17:10.5095504Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-12-04T09:17:10.5097600Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-12-04T09:17:10.5100490Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-12-04T09:17:10.5102817Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-12-04T09:17:10.5105192Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 2025-12-04T09:17:10.5108012Z * [new branch] gh/guilhermeleobas/247/base -> origin/gh/guilhermeleobas/247/base 2025-12-04T09:17:10.5110184Z * [new branch] gh/guilhermeleobas/247/head -> origin/gh/guilhermeleobas/247/head 2025-12-04T09:17:10.5112385Z * [new branch] gh/guilhermeleobas/247/orig -> origin/gh/guilhermeleobas/247/orig 2025-12-04T09:17:10.5115235Z * [new branch] gh/guilhermeleobas/248/base -> origin/gh/guilhermeleobas/248/base 2025-12-04T09:17:10.5117363Z * [new branch] gh/guilhermeleobas/248/head -> origin/gh/guilhermeleobas/248/head 2025-12-04T09:17:10.5119592Z * [new branch] gh/guilhermeleobas/248/orig -> origin/gh/guilhermeleobas/248/orig 2025-12-04T09:17:10.5122543Z * [new branch] gh/guilhermeleobas/250/base -> origin/gh/guilhermeleobas/250/base 2025-12-04T09:17:10.5124835Z * [new branch] gh/guilhermeleobas/250/head -> origin/gh/guilhermeleobas/250/head 2025-12-04T09:17:10.5127102Z * [new branch] gh/guilhermeleobas/250/orig -> origin/gh/guilhermeleobas/250/orig 2025-12-04T09:17:10.5130572Z * [new branch] gh/guilhermeleobas/253/base -> origin/gh/guilhermeleobas/253/base 2025-12-04T09:17:10.5132704Z * [new branch] gh/guilhermeleobas/253/head -> origin/gh/guilhermeleobas/253/head 2025-12-04T09:17:10.5135509Z * [new branch] gh/guilhermeleobas/253/orig -> origin/gh/guilhermeleobas/253/orig 2025-12-04T09:17:10.5138186Z * [new branch] gh/guilhermeleobas/254/base -> origin/gh/guilhermeleobas/254/base 2025-12-04T09:17:10.5140331Z * [new branch] gh/guilhermeleobas/254/head -> origin/gh/guilhermeleobas/254/head 2025-12-04T09:17:10.5142613Z * [new branch] gh/guilhermeleobas/254/orig -> origin/gh/guilhermeleobas/254/orig 2025-12-04T09:17:10.5145672Z * [new branch] gh/guilhermeleobas/255/base -> origin/gh/guilhermeleobas/255/base 2025-12-04T09:17:10.5147852Z * [new branch] gh/guilhermeleobas/255/head -> origin/gh/guilhermeleobas/255/head 2025-12-04T09:17:10.5150256Z * [new branch] gh/guilhermeleobas/255/orig -> origin/gh/guilhermeleobas/255/orig 2025-12-04T09:17:10.5153653Z * [new branch] gh/guilhermeleobas/256/base -> origin/gh/guilhermeleobas/256/base 2025-12-04T09:17:10.5155951Z * [new branch] gh/guilhermeleobas/256/head -> origin/gh/guilhermeleobas/256/head 2025-12-04T09:17:10.5158136Z * [new branch] gh/guilhermeleobas/256/orig -> origin/gh/guilhermeleobas/256/orig 2025-12-04T09:17:10.5160994Z * [new branch] gh/guilhermeleobas/257/base -> origin/gh/guilhermeleobas/257/base 2025-12-04T09:17:10.5163115Z * [new branch] gh/guilhermeleobas/257/head -> origin/gh/guilhermeleobas/257/head 2025-12-04T09:17:10.5165476Z * [new branch] gh/guilhermeleobas/257/orig -> origin/gh/guilhermeleobas/257/orig 2025-12-04T09:17:10.5168418Z * [new branch] gh/guilhermeleobas/258/base -> origin/gh/guilhermeleobas/258/base 2025-12-04T09:17:10.5170618Z * [new branch] gh/guilhermeleobas/258/head -> origin/gh/guilhermeleobas/258/head 2025-12-04T09:17:10.5172786Z * [new branch] gh/guilhermeleobas/258/orig -> origin/gh/guilhermeleobas/258/orig 2025-12-04T09:17:10.5175734Z * [new branch] gh/guilhermeleobas/259/base -> origin/gh/guilhermeleobas/259/base 2025-12-04T09:17:10.5177876Z * [new branch] gh/guilhermeleobas/259/head -> origin/gh/guilhermeleobas/259/head 2025-12-04T09:17:10.5180145Z * [new branch] gh/guilhermeleobas/259/orig -> origin/gh/guilhermeleobas/259/orig 2025-12-04T09:17:10.5183237Z * [new branch] gh/guilhermeleobas/260/base -> origin/gh/guilhermeleobas/260/base 2025-12-04T09:17:10.5185471Z * [new branch] gh/guilhermeleobas/260/head -> origin/gh/guilhermeleobas/260/head 2025-12-04T09:17:10.5190493Z * [new branch] gh/guilhermeleobas/260/orig -> origin/gh/guilhermeleobas/260/orig 2025-12-04T09:17:10.5191354Z * [new branch] gh/guilhermeleobas/261/base -> origin/gh/guilhermeleobas/261/base 2025-12-04T09:17:10.5192030Z * [new branch] gh/guilhermeleobas/261/head -> origin/gh/guilhermeleobas/261/head 2025-12-04T09:17:10.5194665Z * [new branch] gh/guilhermeleobas/261/orig -> origin/gh/guilhermeleobas/261/orig 2025-12-04T09:17:10.5197624Z * [new branch] gh/guilhermeleobas/262/base -> origin/gh/guilhermeleobas/262/base 2025-12-04T09:17:10.5199885Z * [new branch] gh/guilhermeleobas/262/head -> origin/gh/guilhermeleobas/262/head 2025-12-04T09:17:10.5201998Z * [new branch] gh/guilhermeleobas/262/orig -> origin/gh/guilhermeleobas/262/orig 2025-12-04T09:17:10.5205560Z * [new branch] gh/guilhermeleobas/263/base -> origin/gh/guilhermeleobas/263/base 2025-12-04T09:17:10.5206707Z * [new branch] gh/guilhermeleobas/263/head -> origin/gh/guilhermeleobas/263/head 2025-12-04T09:17:10.5209048Z * [new branch] gh/guilhermeleobas/263/orig -> origin/gh/guilhermeleobas/263/orig 2025-12-04T09:17:10.5212070Z * [new branch] gh/guilhermeleobas/264/base -> origin/gh/guilhermeleobas/264/base 2025-12-04T09:17:10.5214208Z * [new branch] gh/guilhermeleobas/264/head -> origin/gh/guilhermeleobas/264/head 2025-12-04T09:17:10.5216354Z * [new branch] gh/guilhermeleobas/264/orig -> origin/gh/guilhermeleobas/264/orig 2025-12-04T09:17:10.5219305Z * [new branch] gh/guilhermeleobas/265/base -> origin/gh/guilhermeleobas/265/base 2025-12-04T09:17:10.5221458Z * [new branch] gh/guilhermeleobas/265/head -> origin/gh/guilhermeleobas/265/head 2025-12-04T09:17:10.5223926Z * [new branch] gh/guilhermeleobas/265/orig -> origin/gh/guilhermeleobas/265/orig 2025-12-04T09:17:10.5227214Z * [new branch] gh/guilhermeleobas/266/base -> origin/gh/guilhermeleobas/266/base 2025-12-04T09:17:10.5229236Z * [new branch] gh/guilhermeleobas/266/head -> origin/gh/guilhermeleobas/266/head 2025-12-04T09:17:10.5231416Z * [new branch] gh/guilhermeleobas/266/orig -> origin/gh/guilhermeleobas/266/orig 2025-12-04T09:17:10.5235006Z * [new branch] gh/guilhermeleobas/267/base -> origin/gh/guilhermeleobas/267/base 2025-12-04T09:17:10.5237021Z * [new branch] gh/guilhermeleobas/267/head -> origin/gh/guilhermeleobas/267/head 2025-12-04T09:17:10.5239016Z * [new branch] gh/guilhermeleobas/267/orig -> origin/gh/guilhermeleobas/267/orig 2025-12-04T09:17:10.5242517Z * [new branch] gh/hameerabbasi/1/base -> origin/gh/hameerabbasi/1/base 2025-12-04T09:17:10.5244711Z * [new branch] gh/hameerabbasi/1/head -> origin/gh/hameerabbasi/1/head 2025-12-04T09:17:10.5247440Z * [new branch] gh/hameerabbasi/2/base -> origin/gh/hameerabbasi/2/base 2025-12-04T09:17:10.5249642Z * [new branch] gh/hameerabbasi/2/head -> origin/gh/hameerabbasi/2/head 2025-12-04T09:17:10.5251931Z * [new branch] gh/hameerabbasi/2/orig -> origin/gh/hameerabbasi/2/orig 2025-12-04T09:17:10.5254764Z * [new branch] gh/hameerabbasi/3/base -> origin/gh/hameerabbasi/3/base 2025-12-04T09:17:10.5257031Z * [new branch] gh/hameerabbasi/3/head -> origin/gh/hameerabbasi/3/head 2025-12-04T09:17:10.5259350Z * [new branch] gh/hameerabbasi/3/orig -> origin/gh/hameerabbasi/3/orig 2025-12-04T09:17:10.5262148Z * [new branch] gh/hameerabbasi/4/base -> origin/gh/hameerabbasi/4/base 2025-12-04T09:17:10.5264427Z * [new branch] gh/hameerabbasi/4/head -> origin/gh/hameerabbasi/4/head 2025-12-04T09:17:10.5266495Z * [new branch] gh/hameerabbasi/4/orig -> origin/gh/hameerabbasi/4/orig 2025-12-04T09:17:10.5269812Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-12-04T09:17:10.5272646Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-12-04T09:17:10.5275345Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-12-04T09:17:10.5278294Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-12-04T09:17:10.5281142Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-12-04T09:17:10.5284032Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-12-04T09:17:10.5288083Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-12-04T09:17:10.5290146Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-12-04T09:17:10.5293753Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-12-04T09:17:10.5295725Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-12-04T09:17:10.5298495Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-12-04T09:17:10.5300769Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-12-04T09:17:10.5302990Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-12-04T09:17:10.5305970Z * [new branch] gh/isuruf/158/base -> origin/gh/isuruf/158/base 2025-12-04T09:17:10.5308156Z * [new branch] gh/isuruf/158/head -> origin/gh/isuruf/158/head 2025-12-04T09:17:10.5310953Z * [new branch] gh/isuruf/159/base -> origin/gh/isuruf/159/base 2025-12-04T09:17:10.5313090Z * [new branch] gh/isuruf/159/head -> origin/gh/isuruf/159/head 2025-12-04T09:17:10.5315911Z * [new branch] gh/isuruf/160/base -> origin/gh/isuruf/160/base 2025-12-04T09:17:10.5317979Z * [new branch] gh/isuruf/160/head -> origin/gh/isuruf/160/head 2025-12-04T09:17:10.5320184Z * [new branch] gh/isuruf/160/orig -> origin/gh/isuruf/160/orig 2025-12-04T09:17:10.5322968Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-12-04T09:17:10.5325158Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-12-04T09:17:10.5327404Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-12-04T09:17:10.5330789Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-12-04T09:17:10.5333340Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-12-04T09:17:10.5335564Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-12-04T09:17:10.5338404Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-12-04T09:17:10.5340523Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-12-04T09:17:10.5342717Z * [new branch] gh/jamesjwu/187/orig -> origin/gh/jamesjwu/187/orig 2025-12-04T09:17:10.5345648Z * [new branch] gh/jamesjwu/196/base -> origin/gh/jamesjwu/196/base 2025-12-04T09:17:10.5347774Z * [new branch] gh/jamesjwu/196/head -> origin/gh/jamesjwu/196/head 2025-12-04T09:17:10.5350027Z * [new branch] gh/jamesjwu/196/orig -> origin/gh/jamesjwu/196/orig 2025-12-04T09:17:10.5352828Z * [new branch] gh/jamesjwu/198/base -> origin/gh/jamesjwu/198/base 2025-12-04T09:17:10.5354974Z * [new branch] gh/jamesjwu/198/head -> origin/gh/jamesjwu/198/head 2025-12-04T09:17:10.5357266Z * [new branch] gh/jamesjwu/198/orig -> origin/gh/jamesjwu/198/orig 2025-12-04T09:17:10.5360105Z * [new branch] gh/jamesjwu/207/base -> origin/gh/jamesjwu/207/base 2025-12-04T09:17:10.5362399Z * [new branch] gh/jamesjwu/207/head -> origin/gh/jamesjwu/207/head 2025-12-04T09:17:10.5364586Z * [new branch] gh/jamesjwu/207/orig -> origin/gh/jamesjwu/207/orig 2025-12-04T09:17:10.5367549Z * [new branch] gh/jamesjwu/208/base -> origin/gh/jamesjwu/208/base 2025-12-04T09:17:10.5369711Z * [new branch] gh/jamesjwu/208/head -> origin/gh/jamesjwu/208/head 2025-12-04T09:17:10.5371885Z * [new branch] gh/jamesjwu/208/orig -> origin/gh/jamesjwu/208/orig 2025-12-04T09:17:10.5374791Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-12-04T09:17:10.5376992Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-12-04T09:17:10.5379861Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-12-04T09:17:10.5381793Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-12-04T09:17:10.5384729Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-12-04T09:17:10.5386992Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-12-04T09:17:10.5389647Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-12-04T09:17:10.5391779Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-12-04T09:17:10.5394448Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-12-04T09:17:10.5396583Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-12-04T09:17:10.5399297Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-12-04T09:17:10.5401425Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-12-04T09:17:10.5404126Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-12-04T09:17:10.5406280Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-12-04T09:17:10.5408978Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-12-04T09:17:10.5411269Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-12-04T09:17:10.5414193Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-12-04T09:17:10.5416359Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-12-04T09:17:10.5419058Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-12-04T09:17:10.5421163Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-12-04T09:17:10.5424941Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-12-04T09:17:10.5428953Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-12-04T09:17:10.5431645Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-12-04T09:17:10.5433769Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-12-04T09:17:10.5437282Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 2025-12-04T09:17:10.5439929Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-12-04T09:17:10.5442870Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-12-04T09:17:10.5444931Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-12-04T09:17:10.5448433Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-12-04T09:17:10.5450661Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-12-04T09:17:10.5452808Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-12-04T09:17:10.5455560Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-12-04T09:17:10.5457706Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-12-04T09:17:10.5459860Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-12-04T09:17:10.5463099Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-12-04T09:17:10.5465337Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-12-04T09:17:10.5467487Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-12-04T09:17:10.5470437Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-12-04T09:17:10.5472762Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-12-04T09:17:10.5474736Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-12-04T09:17:10.5478462Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-12-04T09:17:10.5480634Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-12-04T09:17:10.5483337Z * [new branch] gh/janeyx99/303/base -> origin/gh/janeyx99/303/base 2025-12-04T09:17:10.5485435Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-12-04T09:17:10.5488352Z * [new branch] gh/janeyx99/305/base -> origin/gh/janeyx99/305/base 2025-12-04T09:17:10.5490532Z * [new branch] gh/janeyx99/305/head -> origin/gh/janeyx99/305/head 2025-12-04T09:17:10.5493286Z * [new branch] gh/janeyx99/306/base -> origin/gh/janeyx99/306/base 2025-12-04T09:17:10.5495393Z * [new branch] gh/janeyx99/306/head -> origin/gh/janeyx99/306/head 2025-12-04T09:17:10.5498307Z * [new branch] gh/janeyx99/314/base -> origin/gh/janeyx99/314/base 2025-12-04T09:17:10.5500508Z * [new branch] gh/janeyx99/314/head -> origin/gh/janeyx99/314/head 2025-12-04T09:17:10.5502753Z * [new branch] gh/janeyx99/314/orig -> origin/gh/janeyx99/314/orig 2025-12-04T09:17:10.5505734Z * [new branch] gh/janeyx99/315/base -> origin/gh/janeyx99/315/base 2025-12-04T09:17:10.5507907Z * [new branch] gh/janeyx99/315/head -> origin/gh/janeyx99/315/head 2025-12-04T09:17:10.5510090Z * [new branch] gh/janeyx99/315/orig -> origin/gh/janeyx99/315/orig 2025-12-04T09:17:10.5513017Z * [new branch] gh/janeyx99/316/base -> origin/gh/janeyx99/316/base 2025-12-04T09:17:10.5515188Z * [new branch] gh/janeyx99/316/head -> origin/gh/janeyx99/316/head 2025-12-04T09:17:10.5517346Z * [new branch] gh/janeyx99/316/orig -> origin/gh/janeyx99/316/orig 2025-12-04T09:17:10.5520383Z * [new branch] gh/janeyx99/317/base -> origin/gh/janeyx99/317/base 2025-12-04T09:17:10.5522451Z * [new branch] gh/janeyx99/317/head -> origin/gh/janeyx99/317/head 2025-12-04T09:17:10.5524945Z * [new branch] gh/janeyx99/317/orig -> origin/gh/janeyx99/317/orig 2025-12-04T09:17:10.5528001Z * [new branch] gh/janeyx99/325/base -> origin/gh/janeyx99/325/base 2025-12-04T09:17:10.5530199Z * [new branch] gh/janeyx99/325/head -> origin/gh/janeyx99/325/head 2025-12-04T09:17:10.5532362Z * [new branch] gh/janeyx99/325/orig -> origin/gh/janeyx99/325/orig 2025-12-04T09:17:10.5535226Z * [new branch] gh/janeyx99/327/base -> origin/gh/janeyx99/327/base 2025-12-04T09:17:10.5537395Z * [new branch] gh/janeyx99/327/head -> origin/gh/janeyx99/327/head 2025-12-04T09:17:10.5539546Z * [new branch] gh/janeyx99/327/orig -> origin/gh/janeyx99/327/orig 2025-12-04T09:17:10.5542425Z * [new branch] gh/janeyx99/328/base -> origin/gh/janeyx99/328/base 2025-12-04T09:17:10.5544804Z * [new branch] gh/janeyx99/328/head -> origin/gh/janeyx99/328/head 2025-12-04T09:17:10.5546942Z * [new branch] gh/janeyx99/328/orig -> origin/gh/janeyx99/328/orig 2025-12-04T09:17:10.5549694Z * [new branch] gh/janeyx99/329/base -> origin/gh/janeyx99/329/base 2025-12-04T09:17:10.5551911Z * [new branch] gh/janeyx99/329/head -> origin/gh/janeyx99/329/head 2025-12-04T09:17:10.5554045Z * [new branch] gh/janeyx99/329/orig -> origin/gh/janeyx99/329/orig 2025-12-04T09:17:10.5557684Z * [new branch] gh/janeyx99/330/base -> origin/gh/janeyx99/330/base 2025-12-04T09:17:10.5560385Z * [new branch] gh/janeyx99/330/head -> origin/gh/janeyx99/330/head 2025-12-04T09:17:10.5562003Z * [new branch] gh/janeyx99/330/orig -> origin/gh/janeyx99/330/orig 2025-12-04T09:17:10.5564889Z * [new branch] gh/janeyx99/331/base -> origin/gh/janeyx99/331/base 2025-12-04T09:17:10.5567087Z * [new branch] gh/janeyx99/331/head -> origin/gh/janeyx99/331/head 2025-12-04T09:17:10.5569162Z * [new branch] gh/janeyx99/331/orig -> origin/gh/janeyx99/331/orig 2025-12-04T09:17:10.5572307Z * [new branch] gh/janeyx99/332/base -> origin/gh/janeyx99/332/base 2025-12-04T09:17:10.5574294Z * [new branch] gh/janeyx99/332/head -> origin/gh/janeyx99/332/head 2025-12-04T09:17:10.5576354Z * [new branch] gh/janeyx99/332/orig -> origin/gh/janeyx99/332/orig 2025-12-04T09:17:10.5579075Z * [new branch] gh/janeyx99/333/base -> origin/gh/janeyx99/333/base 2025-12-04T09:17:10.5581176Z * [new branch] gh/janeyx99/333/head -> origin/gh/janeyx99/333/head 2025-12-04T09:17:10.5583982Z * [new branch] gh/janeyx99/333/orig -> origin/gh/janeyx99/333/orig 2025-12-04T09:17:10.5587139Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-12-04T09:17:10.5589237Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-12-04T09:17:10.5591390Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-12-04T09:17:10.5594811Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-12-04T09:17:10.5596913Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-12-04T09:17:10.5599714Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-12-04T09:17:10.5601879Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-12-04T09:17:10.5603979Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-12-04T09:17:10.5606810Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-12-04T09:17:10.5609021Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-12-04T09:17:10.5611146Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-12-04T09:17:10.5614063Z * [new branch] gh/jansel/533/base -> origin/gh/jansel/533/base 2025-12-04T09:17:10.5616141Z * [new branch] gh/jansel/533/head -> origin/gh/jansel/533/head 2025-12-04T09:17:10.5618280Z * [new branch] gh/jansel/533/orig -> origin/gh/jansel/533/orig 2025-12-04T09:17:10.5621094Z * [new branch] gh/jansel/552/base -> origin/gh/jansel/552/base 2025-12-04T09:17:10.5623302Z * [new branch] gh/jansel/552/head -> origin/gh/jansel/552/head 2025-12-04T09:17:10.5625802Z * [new branch] gh/jansel/552/orig -> origin/gh/jansel/552/orig 2025-12-04T09:17:10.5628693Z * [new branch] gh/jansel/553/base -> origin/gh/jansel/553/base 2025-12-04T09:17:10.5630831Z * [new branch] gh/jansel/553/head -> origin/gh/jansel/553/head 2025-12-04T09:17:10.5632962Z * [new branch] gh/jansel/553/orig -> origin/gh/jansel/553/orig 2025-12-04T09:17:10.5635809Z * [new branch] gh/jansel/554/base -> origin/gh/jansel/554/base 2025-12-04T09:17:10.5638018Z * [new branch] gh/jansel/554/head -> origin/gh/jansel/554/head 2025-12-04T09:17:10.5640154Z * [new branch] gh/jansel/554/orig -> origin/gh/jansel/554/orig 2025-12-04T09:17:10.5643100Z * [new branch] gh/jansel/555/base -> origin/gh/jansel/555/base 2025-12-04T09:17:10.5645328Z * [new branch] gh/jansel/555/head -> origin/gh/jansel/555/head 2025-12-04T09:17:10.5647475Z * [new branch] gh/jansel/555/orig -> origin/gh/jansel/555/orig 2025-12-04T09:17:10.5650279Z * [new branch] gh/jansel/556/base -> origin/gh/jansel/556/base 2025-12-04T09:17:10.5652632Z * [new branch] gh/jansel/556/head -> origin/gh/jansel/556/head 2025-12-04T09:17:10.5654587Z * [new branch] gh/jansel/556/orig -> origin/gh/jansel/556/orig 2025-12-04T09:17:10.5657462Z * [new branch] gh/jansel/557/base -> origin/gh/jansel/557/base 2025-12-04T09:17:10.5659579Z * [new branch] gh/jansel/557/head -> origin/gh/jansel/557/head 2025-12-04T09:17:10.5661758Z * [new branch] gh/jansel/557/orig -> origin/gh/jansel/557/orig 2025-12-04T09:17:10.5664811Z * [new branch] gh/jansel/558/base -> origin/gh/jansel/558/base 2025-12-04T09:17:10.5666926Z * [new branch] gh/jansel/558/head -> origin/gh/jansel/558/head 2025-12-04T09:17:10.5669047Z * [new branch] gh/jansel/558/orig -> origin/gh/jansel/558/orig 2025-12-04T09:17:10.5671975Z * [new branch] gh/jansel/559/base -> origin/gh/jansel/559/base 2025-12-04T09:17:10.5674096Z * [new branch] gh/jansel/559/head -> origin/gh/jansel/559/head 2025-12-04T09:17:10.5676497Z * [new branch] gh/jansel/559/orig -> origin/gh/jansel/559/orig 2025-12-04T09:17:10.5679615Z * [new branch] gh/jansel/560/base -> origin/gh/jansel/560/base 2025-12-04T09:17:10.5681687Z * [new branch] gh/jansel/560/head -> origin/gh/jansel/560/head 2025-12-04T09:17:10.5683851Z * [new branch] gh/jansel/560/orig -> origin/gh/jansel/560/orig 2025-12-04T09:17:10.5686647Z * [new branch] gh/jansel/561/base -> origin/gh/jansel/561/base 2025-12-04T09:17:10.5688792Z * [new branch] gh/jansel/561/head -> origin/gh/jansel/561/head 2025-12-04T09:17:10.5690886Z * [new branch] gh/jansel/561/orig -> origin/gh/jansel/561/orig 2025-12-04T09:17:10.5693740Z * [new branch] gh/jansel/562/base -> origin/gh/jansel/562/base 2025-12-04T09:17:10.5695826Z * [new branch] gh/jansel/562/head -> origin/gh/jansel/562/head 2025-12-04T09:17:10.5697890Z * [new branch] gh/jansel/562/orig -> origin/gh/jansel/562/orig 2025-12-04T09:17:10.5700825Z * [new branch] gh/jansel/563/base -> origin/gh/jansel/563/base 2025-12-04T09:17:10.5703040Z * [new branch] gh/jansel/563/head -> origin/gh/jansel/563/head 2025-12-04T09:17:10.5705277Z * [new branch] gh/jansel/563/orig -> origin/gh/jansel/563/orig 2025-12-04T09:17:10.5708639Z * [new branch] gh/jansel/564/base -> origin/gh/jansel/564/base 2025-12-04T09:17:10.5710772Z * [new branch] gh/jansel/564/head -> origin/gh/jansel/564/head 2025-12-04T09:17:10.5712956Z * [new branch] gh/jansel/564/orig -> origin/gh/jansel/564/orig 2025-12-04T09:17:10.5715886Z * [new branch] gh/jansel/565/base -> origin/gh/jansel/565/base 2025-12-04T09:17:10.5718145Z * [new branch] gh/jansel/565/head -> origin/gh/jansel/565/head 2025-12-04T09:17:10.5720276Z * [new branch] gh/jansel/565/orig -> origin/gh/jansel/565/orig 2025-12-04T09:17:10.5723269Z * [new branch] gh/jansel/566/base -> origin/gh/jansel/566/base 2025-12-04T09:17:10.5725432Z * [new branch] gh/jansel/566/head -> origin/gh/jansel/566/head 2025-12-04T09:17:10.5727656Z * [new branch] gh/jansel/566/orig -> origin/gh/jansel/566/orig 2025-12-04T09:17:10.5730687Z * [new branch] gh/jansel/567/base -> origin/gh/jansel/567/base 2025-12-04T09:17:10.5732911Z * [new branch] gh/jansel/567/head -> origin/gh/jansel/567/head 2025-12-04T09:17:10.5734879Z * [new branch] gh/jansel/567/orig -> origin/gh/jansel/567/orig 2025-12-04T09:17:10.5737878Z * [new branch] gh/jansel/568/base -> origin/gh/jansel/568/base 2025-12-04T09:17:10.5740053Z * [new branch] gh/jansel/568/head -> origin/gh/jansel/568/head 2025-12-04T09:17:10.5742199Z * [new branch] gh/jansel/568/orig -> origin/gh/jansel/568/orig 2025-12-04T09:17:10.5745278Z * [new branch] gh/jansel/569/base -> origin/gh/jansel/569/base 2025-12-04T09:17:10.5747381Z * [new branch] gh/jansel/569/head -> origin/gh/jansel/569/head 2025-12-04T09:17:10.5749514Z * [new branch] gh/jansel/569/orig -> origin/gh/jansel/569/orig 2025-12-04T09:17:10.5752372Z * [new branch] gh/jansel/570/base -> origin/gh/jansel/570/base 2025-12-04T09:17:10.5754542Z * [new branch] gh/jansel/570/head -> origin/gh/jansel/570/head 2025-12-04T09:17:10.5756606Z * [new branch] gh/jansel/570/orig -> origin/gh/jansel/570/orig 2025-12-04T09:17:10.5759604Z * [new branch] gh/jansel/571/base -> origin/gh/jansel/571/base 2025-12-04T09:17:10.5761690Z * [new branch] gh/jansel/571/head -> origin/gh/jansel/571/head 2025-12-04T09:17:10.5763836Z * [new branch] gh/jansel/571/orig -> origin/gh/jansel/571/orig 2025-12-04T09:17:10.5766640Z * [new branch] gh/jansel/572/base -> origin/gh/jansel/572/base 2025-12-04T09:17:10.5768768Z * [new branch] gh/jansel/572/head -> origin/gh/jansel/572/head 2025-12-04T09:17:10.5770879Z * [new branch] gh/jansel/572/orig -> origin/gh/jansel/572/orig 2025-12-04T09:17:10.5773864Z * [new branch] gh/jansel/573/base -> origin/gh/jansel/573/base 2025-12-04T09:17:10.5775945Z * [new branch] gh/jansel/573/head -> origin/gh/jansel/573/head 2025-12-04T09:17:10.5778048Z * [new branch] gh/jansel/573/orig -> origin/gh/jansel/573/orig 2025-12-04T09:17:10.5780961Z * [new branch] gh/jansel/574/base -> origin/gh/jansel/574/base 2025-12-04T09:17:10.5783177Z * [new branch] gh/jansel/574/head -> origin/gh/jansel/574/head 2025-12-04T09:17:10.5785451Z * [new branch] gh/jansel/574/orig -> origin/gh/jansel/574/orig 2025-12-04T09:17:10.5788412Z * [new branch] gh/jansel/575/base -> origin/gh/jansel/575/base 2025-12-04T09:17:10.5790593Z * [new branch] gh/jansel/575/head -> origin/gh/jansel/575/head 2025-12-04T09:17:10.5792710Z * [new branch] gh/jansel/575/orig -> origin/gh/jansel/575/orig 2025-12-04T09:17:10.5795634Z * [new branch] gh/jansel/576/base -> origin/gh/jansel/576/base 2025-12-04T09:17:10.5797846Z * [new branch] gh/jansel/576/head -> origin/gh/jansel/576/head 2025-12-04T09:17:10.5799977Z * [new branch] gh/jansel/576/orig -> origin/gh/jansel/576/orig 2025-12-04T09:17:10.5803421Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-12-04T09:17:10.5805542Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-12-04T09:17:10.5807678Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-12-04T09:17:10.5810590Z * [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-12-04T09:17:10.5812589Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-12-04T09:17:10.5815168Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-12-04T09:17:10.5818770Z * [new branch] gh/jerryzh168/1/base -> origin/gh/jerryzh168/1/base 2025-12-04T09:17:10.5820735Z * [new branch] gh/jerryzh168/1/head -> origin/gh/jerryzh168/1/head 2025-12-04T09:17:10.5822890Z * [new branch] gh/jerryzh168/1/orig -> origin/gh/jerryzh168/1/orig 2025-12-04T09:17:10.5828603Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-12-04T09:17:10.5830702Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-12-04T09:17:10.5832843Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-12-04T09:17:10.5835564Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-12-04T09:17:10.5837708Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-12-04T09:17:10.5839851Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-12-04T09:17:10.5842834Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-12-04T09:17:10.5844838Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-12-04T09:17:10.5847034Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-12-04T09:17:10.5850316Z * [new branch] gh/jiayisunx/77/base -> origin/gh/jiayisunx/77/base 2025-12-04T09:17:10.5852442Z * [new branch] gh/jiayisunx/77/head -> origin/gh/jiayisunx/77/head 2025-12-04T09:17:10.5854518Z * [new branch] gh/jiayisunx/77/orig -> origin/gh/jiayisunx/77/orig 2025-12-04T09:17:10.5857483Z * [new branch] gh/jiayisunx/78/base -> origin/gh/jiayisunx/78/base 2025-12-04T09:17:10.5859556Z * [new branch] gh/jiayisunx/78/head -> origin/gh/jiayisunx/78/head 2025-12-04T09:17:10.5861711Z * [new branch] gh/jiayisunx/78/orig -> origin/gh/jiayisunx/78/orig 2025-12-04T09:17:10.5864679Z * [new branch] gh/jiayisunx/79/base -> origin/gh/jiayisunx/79/base 2025-12-04T09:17:10.5866819Z * [new branch] gh/jiayisunx/79/head -> origin/gh/jiayisunx/79/head 2025-12-04T09:17:10.5868940Z * [new branch] gh/jiayisunx/79/orig -> origin/gh/jiayisunx/79/orig 2025-12-04T09:17:10.5871895Z * [new branch] gh/jiayisunx/82/base -> origin/gh/jiayisunx/82/base 2025-12-04T09:17:10.5873993Z * [new branch] gh/jiayisunx/82/head -> origin/gh/jiayisunx/82/head 2025-12-04T09:17:10.5876107Z * [new branch] gh/jiayisunx/82/orig -> origin/gh/jiayisunx/82/orig 2025-12-04T09:17:10.5878983Z * [new branch] gh/jiayisunx/83/base -> origin/gh/jiayisunx/83/base 2025-12-04T09:17:10.5881113Z * [new branch] gh/jiayisunx/83/head -> origin/gh/jiayisunx/83/head 2025-12-04T09:17:10.5883315Z * [new branch] gh/jiayisunx/83/orig -> origin/gh/jiayisunx/83/orig 2025-12-04T09:17:10.5886016Z * [new branch] gh/jiayisunx/84/base -> origin/gh/jiayisunx/84/base 2025-12-04T09:17:10.5888149Z * [new branch] gh/jiayisunx/84/head -> origin/gh/jiayisunx/84/head 2025-12-04T09:17:10.5890295Z * [new branch] gh/jiayisunx/84/orig -> origin/gh/jiayisunx/84/orig 2025-12-04T09:17:10.5893160Z * [new branch] gh/jiayisunx/85/base -> origin/gh/jiayisunx/85/base 2025-12-04T09:17:10.5895257Z * [new branch] gh/jiayisunx/85/head -> origin/gh/jiayisunx/85/head 2025-12-04T09:17:10.5897378Z * [new branch] gh/jiayisunx/85/orig -> origin/gh/jiayisunx/85/orig 2025-12-04T09:17:10.5900218Z * [new branch] gh/jiayisunx/86/base -> origin/gh/jiayisunx/86/base 2025-12-04T09:17:10.5902336Z * [new branch] gh/jiayisunx/86/head -> origin/gh/jiayisunx/86/head 2025-12-04T09:17:10.5904929Z * [new branch] gh/jiayisunx/86/orig -> origin/gh/jiayisunx/86/orig 2025-12-04T09:17:10.5907532Z * [new branch] gh/jiayisunx/87/base -> origin/gh/jiayisunx/87/base 2025-12-04T09:17:10.5909652Z * [new branch] gh/jiayisunx/87/head -> origin/gh/jiayisunx/87/head 2025-12-04T09:17:10.5919506Z * [new branch] gh/jiayisunx/87/orig -> origin/gh/jiayisunx/87/orig 2025-12-04T09:17:10.5919871Z * [new branch] gh/jiayisunx/88/base -> origin/gh/jiayisunx/88/base 2025-12-04T09:17:10.5920096Z * [new branch] gh/jiayisunx/88/head -> origin/gh/jiayisunx/88/head 2025-12-04T09:17:10.5920311Z * [new branch] gh/jiayisunx/88/orig -> origin/gh/jiayisunx/88/orig 2025-12-04T09:17:10.5920872Z * [new branch] gh/jiayisunx/89/base -> origin/gh/jiayisunx/89/base 2025-12-04T09:17:10.5923215Z * [new branch] gh/jiayisunx/89/head -> origin/gh/jiayisunx/89/head 2025-12-04T09:17:10.5925544Z * [new branch] gh/jiayisunx/89/orig -> origin/gh/jiayisunx/89/orig 2025-12-04T09:17:10.5928433Z * [new branch] gh/jiayisunx/90/base -> origin/gh/jiayisunx/90/base 2025-12-04T09:17:10.5930481Z * [new branch] gh/jiayisunx/90/head -> origin/gh/jiayisunx/90/head 2025-12-04T09:17:10.5932678Z * [new branch] gh/jiayisunx/90/orig -> origin/gh/jiayisunx/90/orig 2025-12-04T09:17:10.5936417Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-12-04T09:17:10.5938410Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-12-04T09:17:10.5941755Z * [new branch] gh/jturney/1/base -> origin/gh/jturney/1/base 2025-12-04T09:17:10.5944040Z * [new branch] gh/jturney/1/head -> origin/gh/jturney/1/head 2025-12-04T09:17:10.5946257Z * [new branch] gh/jturney/1/orig -> origin/gh/jturney/1/orig 2025-12-04T09:17:10.5949056Z * [new branch] gh/jturney/2/base -> origin/gh/jturney/2/base 2025-12-04T09:17:10.5951153Z * [new branch] gh/jturney/2/head -> origin/gh/jturney/2/head 2025-12-04T09:17:10.5953312Z * [new branch] gh/jturney/2/orig -> origin/gh/jturney/2/orig 2025-12-04T09:17:10.5957039Z * [new branch] gh/karthickai/10/base -> origin/gh/karthickai/10/base 2025-12-04T09:17:10.5959178Z * [new branch] gh/karthickai/10/head -> origin/gh/karthickai/10/head 2025-12-04T09:17:10.5961266Z * [new branch] gh/karthickai/10/orig -> origin/gh/karthickai/10/orig 2025-12-04T09:17:10.5964073Z * [new branch] gh/karthickai/11/base -> origin/gh/karthickai/11/base 2025-12-04T09:17:10.5966276Z * [new branch] gh/karthickai/11/head -> origin/gh/karthickai/11/head 2025-12-04T09:17:10.5968425Z * [new branch] gh/karthickai/11/orig -> origin/gh/karthickai/11/orig 2025-12-04T09:17:10.5971537Z * [new branch] gh/karthickai/12/base -> origin/gh/karthickai/12/base 2025-12-04T09:17:10.5973694Z * [new branch] gh/karthickai/12/head -> origin/gh/karthickai/12/head 2025-12-04T09:17:10.5975879Z * [new branch] gh/karthickai/12/orig -> origin/gh/karthickai/12/orig 2025-12-04T09:17:10.5978644Z * [new branch] gh/karthickai/13/base -> origin/gh/karthickai/13/base 2025-12-04T09:17:10.5980809Z * [new branch] gh/karthickai/13/head -> origin/gh/karthickai/13/head 2025-12-04T09:17:10.5983146Z * [new branch] gh/karthickai/13/orig -> origin/gh/karthickai/13/orig 2025-12-04T09:17:10.5986770Z * [new branch] gh/karthickai/14/base -> origin/gh/karthickai/14/base 2025-12-04T09:17:10.5988920Z * [new branch] gh/karthickai/14/head -> origin/gh/karthickai/14/head 2025-12-04T09:17:10.5991233Z * [new branch] gh/karthickai/14/orig -> origin/gh/karthickai/14/orig 2025-12-04T09:17:10.5994180Z * [new branch] gh/karthickai/15/base -> origin/gh/karthickai/15/base 2025-12-04T09:17:10.5996303Z * [new branch] gh/karthickai/15/head -> origin/gh/karthickai/15/head 2025-12-04T09:17:10.5998464Z * [new branch] gh/karthickai/15/orig -> origin/gh/karthickai/15/orig 2025-12-04T09:17:10.6001199Z * [new branch] gh/karthickai/16/base -> origin/gh/karthickai/16/base 2025-12-04T09:17:10.6003304Z * [new branch] gh/karthickai/16/head -> origin/gh/karthickai/16/head 2025-12-04T09:17:10.6005978Z * [new branch] gh/karthickai/16/orig -> origin/gh/karthickai/16/orig 2025-12-04T09:17:10.6008690Z * [new branch] gh/karthickai/17/base -> origin/gh/karthickai/17/base 2025-12-04T09:17:10.6010743Z * [new branch] gh/karthickai/17/head -> origin/gh/karthickai/17/head 2025-12-04T09:17:10.6012955Z * [new branch] gh/karthickai/17/orig -> origin/gh/karthickai/17/orig 2025-12-04T09:17:10.6015884Z * [new branch] gh/karthickai/18/base -> origin/gh/karthickai/18/base 2025-12-04T09:17:10.6018200Z * [new branch] gh/karthickai/18/head -> origin/gh/karthickai/18/head 2025-12-04T09:17:10.6020414Z * [new branch] gh/karthickai/18/orig -> origin/gh/karthickai/18/orig 2025-12-04T09:17:10.6023451Z * [new branch] gh/karthickai/19/base -> origin/gh/karthickai/19/base 2025-12-04T09:17:10.6025871Z * [new branch] gh/karthickai/19/head -> origin/gh/karthickai/19/head 2025-12-04T09:17:10.6028000Z * [new branch] gh/karthickai/19/orig -> origin/gh/karthickai/19/orig 2025-12-04T09:17:10.6031753Z * [new branch] gh/karthickai/20/base -> origin/gh/karthickai/20/base 2025-12-04T09:17:10.6034584Z * [new branch] gh/karthickai/20/head -> origin/gh/karthickai/20/head 2025-12-04T09:17:10.6036724Z * [new branch] gh/karthickai/20/orig -> origin/gh/karthickai/20/orig 2025-12-04T09:17:10.6039603Z * [new branch] gh/karthickai/21/base -> origin/gh/karthickai/21/base 2025-12-04T09:17:10.6041894Z * [new branch] gh/karthickai/21/head -> origin/gh/karthickai/21/head 2025-12-04T09:17:10.6044135Z * [new branch] gh/karthickai/21/orig -> origin/gh/karthickai/21/orig 2025-12-04T09:17:10.6047113Z * [new branch] gh/karthickai/22/base -> origin/gh/karthickai/22/base 2025-12-04T09:17:10.6049120Z * [new branch] gh/karthickai/22/head -> origin/gh/karthickai/22/head 2025-12-04T09:17:10.6051167Z * [new branch] gh/karthickai/22/orig -> origin/gh/karthickai/22/orig 2025-12-04T09:17:10.6054200Z * [new branch] gh/karthickai/23/base -> origin/gh/karthickai/23/base 2025-12-04T09:17:10.6056428Z * [new branch] gh/karthickai/23/head -> origin/gh/karthickai/23/head 2025-12-04T09:17:10.6058609Z * [new branch] gh/karthickai/23/orig -> origin/gh/karthickai/23/orig 2025-12-04T09:17:10.6061478Z * [new branch] gh/karthickai/24/base -> origin/gh/karthickai/24/base 2025-12-04T09:17:10.6063764Z * [new branch] gh/karthickai/24/head -> origin/gh/karthickai/24/head 2025-12-04T09:17:10.6065833Z * [new branch] gh/karthickai/24/orig -> origin/gh/karthickai/24/orig 2025-12-04T09:17:10.6069200Z * [new branch] gh/karthickai/25/base -> origin/gh/karthickai/25/base 2025-12-04T09:17:10.6071370Z * [new branch] gh/karthickai/25/head -> origin/gh/karthickai/25/head 2025-12-04T09:17:10.6073643Z * [new branch] gh/karthickai/25/orig -> origin/gh/karthickai/25/orig 2025-12-04T09:17:10.6076365Z * [new branch] gh/karthickai/26/base -> origin/gh/karthickai/26/base 2025-12-04T09:17:10.6078771Z * [new branch] gh/karthickai/26/head -> origin/gh/karthickai/26/head 2025-12-04T09:17:10.6080734Z * [new branch] gh/karthickai/26/orig -> origin/gh/karthickai/26/orig 2025-12-04T09:17:10.6085030Z * [new branch] gh/karthickai/6/base -> origin/gh/karthickai/6/base 2025-12-04T09:17:10.6087650Z * [new branch] gh/karthickai/6/head -> origin/gh/karthickai/6/head 2025-12-04T09:17:10.6089757Z * [new branch] gh/karthickai/6/orig -> origin/gh/karthickai/6/orig 2025-12-04T09:17:10.6093149Z * [new branch] gh/krocki/1/base -> origin/gh/krocki/1/base 2025-12-04T09:17:10.6095278Z * [new branch] gh/krocki/1/head -> origin/gh/krocki/1/head 2025-12-04T09:17:10.6097324Z * [new branch] gh/krocki/1/orig -> origin/gh/krocki/1/orig 2025-12-04T09:17:10.6100246Z * [new branch] gh/krocki/2/base -> origin/gh/krocki/2/base 2025-12-04T09:17:10.6102419Z * [new branch] gh/krocki/2/head -> origin/gh/krocki/2/head 2025-12-04T09:17:10.6104742Z * [new branch] gh/krocki/2/orig -> origin/gh/krocki/2/orig 2025-12-04T09:17:10.6108718Z * [new branch] gh/kurtamohler/60/base -> origin/gh/kurtamohler/60/base 2025-12-04T09:17:10.6110812Z * [new branch] gh/kurtamohler/60/head -> origin/gh/kurtamohler/60/head 2025-12-04T09:17:10.6112918Z * [new branch] gh/kurtamohler/60/orig -> origin/gh/kurtamohler/60/orig 2025-12-04T09:17:10.6115708Z * [new branch] gh/kurtamohler/61/base -> origin/gh/kurtamohler/61/base 2025-12-04T09:17:10.6117785Z * [new branch] gh/kurtamohler/61/head -> origin/gh/kurtamohler/61/head 2025-12-04T09:17:10.6119884Z * [new branch] gh/kurtamohler/61/orig -> origin/gh/kurtamohler/61/orig 2025-12-04T09:17:10.6122676Z * [new branch] gh/kurtamohler/62/base -> origin/gh/kurtamohler/62/base 2025-12-04T09:17:10.6125098Z * [new branch] gh/kurtamohler/62/head -> origin/gh/kurtamohler/62/head 2025-12-04T09:17:10.6127719Z * [new branch] gh/kurtamohler/62/orig -> origin/gh/kurtamohler/62/orig 2025-12-04T09:17:10.6130634Z * [new branch] gh/kurtamohler/63/base -> origin/gh/kurtamohler/63/base 2025-12-04T09:17:10.6132735Z * [new branch] gh/kurtamohler/63/head -> origin/gh/kurtamohler/63/head 2025-12-04T09:17:10.6134992Z * [new branch] gh/kurtamohler/63/orig -> origin/gh/kurtamohler/63/orig 2025-12-04T09:17:10.6137809Z * [new branch] gh/kurtamohler/64/base -> origin/gh/kurtamohler/64/base 2025-12-04T09:17:10.6139871Z * [new branch] gh/kurtamohler/64/head -> origin/gh/kurtamohler/64/head 2025-12-04T09:17:10.6141983Z * [new branch] gh/kurtamohler/64/orig -> origin/gh/kurtamohler/64/orig 2025-12-04T09:17:10.6145053Z * [new branch] gh/kurtamohler/65/base -> origin/gh/kurtamohler/65/base 2025-12-04T09:17:10.6147139Z * [new branch] gh/kurtamohler/65/head -> origin/gh/kurtamohler/65/head 2025-12-04T09:17:10.6149178Z * [new branch] gh/kurtamohler/65/orig -> origin/gh/kurtamohler/65/orig 2025-12-04T09:17:10.6151978Z * [new branch] gh/kurtamohler/66/base -> origin/gh/kurtamohler/66/base 2025-12-04T09:17:10.6154158Z * [new branch] gh/kurtamohler/66/head -> origin/gh/kurtamohler/66/head 2025-12-04T09:17:10.6156296Z * [new branch] gh/kurtamohler/66/orig -> origin/gh/kurtamohler/66/orig 2025-12-04T09:17:10.6159205Z * [new branch] gh/kurtamohler/67/base -> origin/gh/kurtamohler/67/base 2025-12-04T09:17:10.6161314Z * [new branch] gh/kurtamohler/67/head -> origin/gh/kurtamohler/67/head 2025-12-04T09:17:10.6164260Z * [new branch] gh/kurtamohler/67/orig -> origin/gh/kurtamohler/67/orig 2025-12-04T09:17:10.6167529Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-12-04T09:17:10.6169827Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-12-04T09:17:10.6171941Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 2025-12-04T09:17:10.6174877Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-12-04T09:17:10.6177004Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-12-04T09:17:10.6179822Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-12-04T09:17:10.6182000Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-12-04T09:17:10.6184267Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-12-04T09:17:10.6187224Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-12-04T09:17:10.6189292Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-12-04T09:17:10.6191433Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-12-04T09:17:10.6194257Z * [new branch] gh/kwen2501/211/base -> origin/gh/kwen2501/211/base 2025-12-04T09:17:10.6196679Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-12-04T09:17:10.6199472Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 2025-12-04T09:17:10.6201574Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-12-04T09:17:10.6203648Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-12-04T09:17:10.6206531Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-12-04T09:17:10.6209231Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-12-04T09:17:10.6211405Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 2025-12-04T09:17:10.6214343Z * [new branch] gh/kwen2501/234/base -> origin/gh/kwen2501/234/base 2025-12-04T09:17:10.6216566Z * [new branch] gh/kwen2501/234/head -> origin/gh/kwen2501/234/head 2025-12-04T09:17:10.6218624Z * [new branch] gh/kwen2501/234/orig -> origin/gh/kwen2501/234/orig 2025-12-04T09:17:10.6221437Z * [new branch] gh/kwen2501/235/base -> origin/gh/kwen2501/235/base 2025-12-04T09:17:10.6223555Z * [new branch] gh/kwen2501/235/head -> origin/gh/kwen2501/235/head 2025-12-04T09:17:10.6227954Z * [new branch] gh/kwen2501/235/orig -> origin/gh/kwen2501/235/orig 2025-12-04T09:17:10.6230729Z * [new branch] gh/kwen2501/236/base -> origin/gh/kwen2501/236/base 2025-12-04T09:17:10.6232896Z * [new branch] gh/kwen2501/236/head -> origin/gh/kwen2501/236/head 2025-12-04T09:17:10.6234992Z * [new branch] gh/kwen2501/236/orig -> origin/gh/kwen2501/236/orig 2025-12-04T09:17:10.6238331Z * [new branch] gh/kwen2501/237/base -> origin/gh/kwen2501/237/base 2025-12-04T09:17:10.6240415Z * [new branch] gh/kwen2501/237/head -> origin/gh/kwen2501/237/head 2025-12-04T09:17:10.6242566Z * [new branch] gh/kwen2501/237/orig -> origin/gh/kwen2501/237/orig 2025-12-04T09:17:10.6245388Z * [new branch] gh/kwen2501/238/base -> origin/gh/kwen2501/238/base 2025-12-04T09:17:10.6247580Z * [new branch] gh/kwen2501/238/head -> origin/gh/kwen2501/238/head 2025-12-04T09:17:10.6250116Z * [new branch] gh/kwen2501/238/orig -> origin/gh/kwen2501/238/orig 2025-12-04T09:17:10.6252843Z * [new branch] gh/kwen2501/240/base -> origin/gh/kwen2501/240/base 2025-12-04T09:17:10.6254756Z * [new branch] gh/kwen2501/240/head -> origin/gh/kwen2501/240/head 2025-12-04T09:17:10.6256904Z * [new branch] gh/kwen2501/240/orig -> origin/gh/kwen2501/240/orig 2025-12-04T09:17:10.6259628Z * [new branch] gh/kwen2501/241/base -> origin/gh/kwen2501/241/base 2025-12-04T09:17:10.6261739Z * [new branch] gh/kwen2501/241/head -> origin/gh/kwen2501/241/head 2025-12-04T09:17:10.6263996Z * [new branch] gh/kwen2501/241/orig -> origin/gh/kwen2501/241/orig 2025-12-04T09:17:10.6266866Z * [new branch] gh/kwen2501/247/base -> origin/gh/kwen2501/247/base 2025-12-04T09:17:10.6268969Z * [new branch] gh/kwen2501/247/head -> origin/gh/kwen2501/247/head 2025-12-04T09:17:10.6271066Z * [new branch] gh/kwen2501/247/orig -> origin/gh/kwen2501/247/orig 2025-12-04T09:17:10.6273882Z * [new branch] gh/kwen2501/252/base -> origin/gh/kwen2501/252/base 2025-12-04T09:17:10.6276090Z * [new branch] gh/kwen2501/252/head -> origin/gh/kwen2501/252/head 2025-12-04T09:17:10.6278219Z * [new branch] gh/kwen2501/252/orig -> origin/gh/kwen2501/252/orig 2025-12-04T09:17:10.6282164Z * [new branch] gh/kwen2501/259/base -> origin/gh/kwen2501/259/base 2025-12-04T09:17:10.6284341Z * [new branch] gh/kwen2501/259/head -> origin/gh/kwen2501/259/head 2025-12-04T09:17:10.6286459Z * [new branch] gh/kwen2501/259/orig -> origin/gh/kwen2501/259/orig 2025-12-04T09:17:10.6289546Z * [new branch] gh/kwen2501/260/base -> origin/gh/kwen2501/260/base 2025-12-04T09:17:10.6291740Z * [new branch] gh/kwen2501/260/head -> origin/gh/kwen2501/260/head 2025-12-04T09:17:10.6293882Z * [new branch] gh/kwen2501/260/orig -> origin/gh/kwen2501/260/orig 2025-12-04T09:17:10.6296797Z * [new branch] gh/kwen2501/268/base -> origin/gh/kwen2501/268/base 2025-12-04T09:17:10.6298870Z * [new branch] gh/kwen2501/268/head -> origin/gh/kwen2501/268/head 2025-12-04T09:17:10.6300980Z * [new branch] gh/kwen2501/268/orig -> origin/gh/kwen2501/268/orig 2025-12-04T09:17:10.6304004Z * [new branch] gh/kwen2501/269/base -> origin/gh/kwen2501/269/base 2025-12-04T09:17:10.6306279Z * [new branch] gh/kwen2501/269/head -> origin/gh/kwen2501/269/head 2025-12-04T09:17:10.6308402Z * [new branch] gh/kwen2501/269/orig -> origin/gh/kwen2501/269/orig 2025-12-04T09:17:10.6311334Z * [new branch] gh/kwen2501/270/base -> origin/gh/kwen2501/270/base 2025-12-04T09:17:10.6313530Z * [new branch] gh/kwen2501/270/head -> origin/gh/kwen2501/270/head 2025-12-04T09:17:10.6315731Z * [new branch] gh/kwen2501/270/orig -> origin/gh/kwen2501/270/orig 2025-12-04T09:17:10.6318834Z * [new branch] gh/kwen2501/271/base -> origin/gh/kwen2501/271/base 2025-12-04T09:17:10.6320996Z * [new branch] gh/kwen2501/271/head -> origin/gh/kwen2501/271/head 2025-12-04T09:17:10.6323100Z * [new branch] gh/kwen2501/271/orig -> origin/gh/kwen2501/271/orig 2025-12-04T09:17:10.6326283Z * [new branch] gh/kwen2501/274/base -> origin/gh/kwen2501/274/base 2025-12-04T09:17:10.6328498Z * [new branch] gh/kwen2501/274/head -> origin/gh/kwen2501/274/head 2025-12-04T09:17:10.6330567Z * [new branch] gh/kwen2501/274/orig -> origin/gh/kwen2501/274/orig 2025-12-04T09:17:10.6334157Z * [new branch] gh/kwen2501/275/base -> origin/gh/kwen2501/275/base 2025-12-04T09:17:10.6336459Z * [new branch] gh/kwen2501/275/head -> origin/gh/kwen2501/275/head 2025-12-04T09:17:10.6338690Z * [new branch] gh/kwen2501/275/orig -> origin/gh/kwen2501/275/orig 2025-12-04T09:17:10.6341490Z * [new branch] gh/kwen2501/276/base -> origin/gh/kwen2501/276/base 2025-12-04T09:17:10.6343663Z * [new branch] gh/kwen2501/276/head -> origin/gh/kwen2501/276/head 2025-12-04T09:17:10.6345817Z * [new branch] gh/kwen2501/276/orig -> origin/gh/kwen2501/276/orig 2025-12-04T09:17:10.6348808Z * [new branch] gh/kwen2501/277/base -> origin/gh/kwen2501/277/base 2025-12-04T09:17:10.6350975Z * [new branch] gh/kwen2501/277/head -> origin/gh/kwen2501/277/head 2025-12-04T09:17:10.6353076Z * [new branch] gh/kwen2501/277/orig -> origin/gh/kwen2501/277/orig 2025-12-04T09:17:10.6355993Z * [new branch] gh/kwen2501/278/base -> origin/gh/kwen2501/278/base 2025-12-04T09:17:10.6358101Z * [new branch] gh/kwen2501/278/head -> origin/gh/kwen2501/278/head 2025-12-04T09:17:10.6360157Z * [new branch] gh/kwen2501/278/orig -> origin/gh/kwen2501/278/orig 2025-12-04T09:17:10.6363111Z * [new branch] gh/kwen2501/279/base -> origin/gh/kwen2501/279/base 2025-12-04T09:17:10.6365420Z * [new branch] gh/kwen2501/279/head -> origin/gh/kwen2501/279/head 2025-12-04T09:17:10.6367611Z * [new branch] gh/kwen2501/279/orig -> origin/gh/kwen2501/279/orig 2025-12-04T09:17:10.6370484Z * [new branch] gh/kwen2501/280/base -> origin/gh/kwen2501/280/base 2025-12-04T09:17:10.6372671Z * [new branch] gh/kwen2501/280/head -> origin/gh/kwen2501/280/head 2025-12-04T09:17:10.6374904Z * [new branch] gh/kwen2501/280/orig -> origin/gh/kwen2501/280/orig 2025-12-04T09:17:10.6377762Z * [new branch] gh/kwen2501/281/base -> origin/gh/kwen2501/281/base 2025-12-04T09:17:10.6379812Z * [new branch] gh/kwen2501/281/head -> origin/gh/kwen2501/281/head 2025-12-04T09:17:10.6381949Z * [new branch] gh/kwen2501/281/orig -> origin/gh/kwen2501/281/orig 2025-12-04T09:17:10.6385088Z * [new branch] gh/kwen2501/282/base -> origin/gh/kwen2501/282/base 2025-12-04T09:17:10.6387171Z * [new branch] gh/kwen2501/282/head -> origin/gh/kwen2501/282/head 2025-12-04T09:17:10.6389319Z * [new branch] gh/kwen2501/282/orig -> origin/gh/kwen2501/282/orig 2025-12-04T09:17:10.6392181Z * [new branch] gh/kwen2501/283/base -> origin/gh/kwen2501/283/base 2025-12-04T09:17:10.6394493Z * [new branch] gh/kwen2501/283/head -> origin/gh/kwen2501/283/head 2025-12-04T09:17:10.6396569Z * [new branch] gh/kwen2501/283/orig -> origin/gh/kwen2501/283/orig 2025-12-04T09:17:10.6399480Z * [new branch] gh/kwen2501/284/base -> origin/gh/kwen2501/284/base 2025-12-04T09:17:10.6401657Z * [new branch] gh/kwen2501/284/head -> origin/gh/kwen2501/284/head 2025-12-04T09:17:10.6403850Z * [new branch] gh/kwen2501/284/orig -> origin/gh/kwen2501/284/orig 2025-12-04T09:17:10.6406703Z * [new branch] gh/kwen2501/285/base -> origin/gh/kwen2501/285/base 2025-12-04T09:17:10.6408816Z * [new branch] gh/kwen2501/285/head -> origin/gh/kwen2501/285/head 2025-12-04T09:17:10.6410934Z * [new branch] gh/kwen2501/285/orig -> origin/gh/kwen2501/285/orig 2025-12-04T09:17:10.6414409Z * [new branch] gh/kwen2501/286/base -> origin/gh/kwen2501/286/base 2025-12-04T09:17:10.6417079Z * [new branch] gh/kwen2501/286/head -> origin/gh/kwen2501/286/head 2025-12-04T09:17:10.6419196Z * [new branch] gh/kwen2501/286/orig -> origin/gh/kwen2501/286/orig 2025-12-04T09:17:10.6422007Z * [new branch] gh/kwen2501/287/base -> origin/gh/kwen2501/287/base 2025-12-04T09:17:10.6424777Z * [new branch] gh/kwen2501/287/head -> origin/gh/kwen2501/287/head 2025-12-04T09:17:10.6428699Z * [new branch] gh/kwen2501/287/orig -> origin/gh/kwen2501/287/orig 2025-12-04T09:17:10.6431683Z * [new branch] gh/kwen2501/288/base -> origin/gh/kwen2501/288/base 2025-12-04T09:17:10.6433920Z * [new branch] gh/kwen2501/288/head -> origin/gh/kwen2501/288/head 2025-12-04T09:17:10.6436830Z * [new branch] gh/kwen2501/288/orig -> origin/gh/kwen2501/288/orig 2025-12-04T09:17:10.6440028Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-12-04T09:17:10.6442268Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-12-04T09:17:10.6444381Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-12-04T09:17:10.6447250Z * [new branch] gh/laithsakka/276/base -> origin/gh/laithsakka/276/base 2025-12-04T09:17:10.6449316Z * [new branch] gh/laithsakka/276/head -> origin/gh/laithsakka/276/head 2025-12-04T09:17:10.6451490Z * [new branch] gh/laithsakka/276/orig -> origin/gh/laithsakka/276/orig 2025-12-04T09:17:10.6454525Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-12-04T09:17:10.6457219Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-12-04T09:17:10.6459841Z * [new branch] gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-12-04T09:17:10.6461946Z * [new branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-12-04T09:17:10.6464784Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-12-04T09:17:10.6466848Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-12-04T09:17:10.6469812Z * [new branch] gh/laithsakka/313/base -> origin/gh/laithsakka/313/base 2025-12-04T09:17:10.6471909Z * [new branch] gh/laithsakka/313/head -> origin/gh/laithsakka/313/head 2025-12-04T09:17:10.6474047Z * [new branch] gh/laithsakka/313/orig -> origin/gh/laithsakka/313/orig 2025-12-04T09:17:10.6477099Z * [new branch] gh/laithsakka/316/base -> origin/gh/laithsakka/316/base 2025-12-04T09:17:10.6479143Z * [new branch] gh/laithsakka/316/head -> origin/gh/laithsakka/316/head 2025-12-04T09:17:10.6481331Z * [new branch] gh/laithsakka/316/orig -> origin/gh/laithsakka/316/orig 2025-12-04T09:17:10.6484152Z * [new branch] gh/laithsakka/317/base -> origin/gh/laithsakka/317/base 2025-12-04T09:17:10.6486250Z * [new branch] gh/laithsakka/317/head -> origin/gh/laithsakka/317/head 2025-12-04T09:17:10.6488332Z * [new branch] gh/laithsakka/317/orig -> origin/gh/laithsakka/317/orig 2025-12-04T09:17:10.6491219Z * [new branch] gh/laithsakka/319/base -> origin/gh/laithsakka/319/base 2025-12-04T09:17:10.6493350Z * [new branch] gh/laithsakka/319/head -> origin/gh/laithsakka/319/head 2025-12-04T09:17:10.6495512Z * [new branch] gh/laithsakka/319/orig -> origin/gh/laithsakka/319/orig 2025-12-04T09:17:10.6498230Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 2025-12-04T09:17:10.6500304Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 2025-12-04T09:17:10.6503363Z * [new branch] gh/laithsakka/320/base -> origin/gh/laithsakka/320/base 2025-12-04T09:17:10.6505483Z * [new branch] gh/laithsakka/320/head -> origin/gh/laithsakka/320/head 2025-12-04T09:17:10.6507606Z * [new branch] gh/laithsakka/320/orig -> origin/gh/laithsakka/320/orig 2025-12-04T09:17:10.6510534Z * [new branch] gh/laithsakka/321/base -> origin/gh/laithsakka/321/base 2025-12-04T09:17:10.6512802Z * [new branch] gh/laithsakka/321/head -> origin/gh/laithsakka/321/head 2025-12-04T09:17:10.6514793Z * [new branch] gh/laithsakka/321/orig -> origin/gh/laithsakka/321/orig 2025-12-04T09:17:10.6517774Z * [new branch] gh/laithsakka/322/base -> origin/gh/laithsakka/322/base 2025-12-04T09:17:10.6519981Z * [new branch] gh/laithsakka/322/head -> origin/gh/laithsakka/322/head 2025-12-04T09:17:10.6522149Z * [new branch] gh/laithsakka/322/orig -> origin/gh/laithsakka/322/orig 2025-12-04T09:17:10.6525191Z * [new branch] gh/laithsakka/323/base -> origin/gh/laithsakka/323/base 2025-12-04T09:17:10.6527564Z * [new branch] gh/laithsakka/323/head -> origin/gh/laithsakka/323/head 2025-12-04T09:17:10.6529667Z * [new branch] gh/laithsakka/323/orig -> origin/gh/laithsakka/323/orig 2025-12-04T09:17:10.6532656Z * [new branch] gh/laithsakka/324/base -> origin/gh/laithsakka/324/base 2025-12-04T09:17:10.6534681Z * [new branch] gh/laithsakka/324/head -> origin/gh/laithsakka/324/head 2025-12-04T09:17:10.6536742Z * [new branch] gh/laithsakka/324/orig -> origin/gh/laithsakka/324/orig 2025-12-04T09:17:10.6539700Z * [new branch] gh/laithsakka/325/base -> origin/gh/laithsakka/325/base 2025-12-04T09:17:10.6541818Z * [new branch] gh/laithsakka/325/head -> origin/gh/laithsakka/325/head 2025-12-04T09:17:10.6544098Z * [new branch] gh/laithsakka/325/orig -> origin/gh/laithsakka/325/orig 2025-12-04T09:17:10.6547418Z * [new branch] gh/laithsakka/326/base -> origin/gh/laithsakka/326/base 2025-12-04T09:17:10.6549552Z * [new branch] gh/laithsakka/326/head -> origin/gh/laithsakka/326/head 2025-12-04T09:17:10.6551727Z * [new branch] gh/laithsakka/326/orig -> origin/gh/laithsakka/326/orig 2025-12-04T09:17:10.6554672Z * [new branch] gh/laithsakka/327/base -> origin/gh/laithsakka/327/base 2025-12-04T09:17:10.6556844Z * [new branch] gh/laithsakka/327/head -> origin/gh/laithsakka/327/head 2025-12-04T09:17:10.6559042Z * [new branch] gh/laithsakka/327/orig -> origin/gh/laithsakka/327/orig 2025-12-04T09:17:10.6561918Z * [new branch] gh/laithsakka/328/base -> origin/gh/laithsakka/328/base 2025-12-04T09:17:10.6564175Z * [new branch] gh/laithsakka/328/head -> origin/gh/laithsakka/328/head 2025-12-04T09:17:10.6566254Z * [new branch] gh/laithsakka/328/orig -> origin/gh/laithsakka/328/orig 2025-12-04T09:17:10.6569664Z * [new branch] gh/liangel/4/base -> origin/gh/liangel/4/base 2025-12-04T09:17:10.6571795Z * [new branch] gh/liangel/4/head -> origin/gh/liangel/4/head 2025-12-04T09:17:10.6573901Z * [new branch] gh/liangel/4/orig -> origin/gh/liangel/4/orig 2025-12-04T09:17:10.6579205Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-12-04T09:17:10.6581271Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-12-04T09:17:10.6584769Z * [new branch] gh/lw/4/base -> origin/gh/lw/4/base 2025-12-04T09:17:10.6586855Z * [new branch] gh/lw/4/head -> origin/gh/lw/4/head 2025-12-04T09:17:10.6588929Z * [new branch] gh/lw/4/orig -> origin/gh/lw/4/orig 2025-12-04T09:17:10.6591684Z * [new branch] gh/lw/5/base -> origin/gh/lw/5/base 2025-12-04T09:17:10.6593841Z * [new branch] gh/lw/5/head -> origin/gh/lw/5/head 2025-12-04T09:17:10.6595919Z * [new branch] gh/lw/5/orig -> origin/gh/lw/5/orig 2025-12-04T09:17:10.6598708Z * [new branch] gh/lw/6/base -> origin/gh/lw/6/base 2025-12-04T09:17:10.6600990Z * [new branch] gh/lw/6/head -> origin/gh/lw/6/head 2025-12-04T09:17:10.6602983Z * [new branch] gh/lw/6/orig -> origin/gh/lw/6/orig 2025-12-04T09:17:10.6606268Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 2025-12-04T09:17:10.6609159Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-12-04T09:17:10.6611234Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-12-04T09:17:10.6613352Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-12-04T09:17:10.6616158Z * [new branch] gh/malfet/506/base -> origin/gh/malfet/506/base 2025-12-04T09:17:10.6618360Z * [new branch] gh/malfet/506/head -> origin/gh/malfet/506/head 2025-12-04T09:17:10.6620553Z * [new branch] gh/malfet/506/orig -> origin/gh/malfet/506/orig 2025-12-04T09:17:10.6623463Z * [new branch] gh/malfet/517/base -> origin/gh/malfet/517/base 2025-12-04T09:17:10.6625939Z * [new branch] gh/malfet/517/head -> origin/gh/malfet/517/head 2025-12-04T09:17:10.6629266Z * [new branch] gh/malfet/528/base -> origin/gh/malfet/528/base 2025-12-04T09:17:10.6631411Z * [new branch] gh/malfet/528/head -> origin/gh/malfet/528/head 2025-12-04T09:17:10.6634266Z * [new branch] gh/malfet/528/orig -> origin/gh/malfet/528/orig 2025-12-04T09:17:10.6636710Z * [new branch] gh/malfet/537/base -> origin/gh/malfet/537/base 2025-12-04T09:17:10.6638873Z * [new branch] gh/malfet/537/head -> origin/gh/malfet/537/head 2025-12-04T09:17:10.6641272Z * [new branch] gh/malfet/537/orig -> origin/gh/malfet/537/orig 2025-12-04T09:17:10.6644056Z * [new branch] gh/malfet/546/base -> origin/gh/malfet/546/base 2025-12-04T09:17:10.6645950Z * [new branch] gh/malfet/546/head -> origin/gh/malfet/546/head 2025-12-04T09:17:10.6648081Z * [new branch] gh/malfet/546/orig -> origin/gh/malfet/546/orig 2025-12-04T09:17:10.6651031Z * [new branch] gh/malfet/565/base -> origin/gh/malfet/565/base 2025-12-04T09:17:10.6653139Z * [new branch] gh/malfet/565/head -> origin/gh/malfet/565/head 2025-12-04T09:17:10.6655240Z * [new branch] gh/malfet/565/orig -> origin/gh/malfet/565/orig 2025-12-04T09:17:10.6658134Z * [new branch] gh/malfet/575/base -> origin/gh/malfet/575/base 2025-12-04T09:17:10.6660248Z * [new branch] gh/malfet/575/head -> origin/gh/malfet/575/head 2025-12-04T09:17:10.6662331Z * [new branch] gh/malfet/575/orig -> origin/gh/malfet/575/orig 2025-12-04T09:17:10.6665279Z * [new branch] gh/malfet/580/base -> origin/gh/malfet/580/base 2025-12-04T09:17:10.6667347Z * [new branch] gh/malfet/580/head -> origin/gh/malfet/580/head 2025-12-04T09:17:10.6669509Z * [new branch] gh/malfet/580/orig -> origin/gh/malfet/580/orig 2025-12-04T09:17:10.6672252Z * [new branch] gh/malfet/581/base -> origin/gh/malfet/581/base 2025-12-04T09:17:10.6674357Z * [new branch] gh/malfet/581/head -> origin/gh/malfet/581/head 2025-12-04T09:17:10.6676474Z * [new branch] gh/malfet/581/orig -> origin/gh/malfet/581/orig 2025-12-04T09:17:10.6679400Z * [new branch] gh/malfet/583/base -> origin/gh/malfet/583/base 2025-12-04T09:17:10.6681427Z * [new branch] gh/malfet/583/head -> origin/gh/malfet/583/head 2025-12-04T09:17:10.6683576Z * [new branch] gh/malfet/583/orig -> origin/gh/malfet/583/orig 2025-12-04T09:17:10.6686384Z * [new branch] gh/malfet/586/base -> origin/gh/malfet/586/base 2025-12-04T09:17:10.6688651Z * [new branch] gh/malfet/586/head -> origin/gh/malfet/586/head 2025-12-04T09:17:10.6690574Z * [new branch] gh/malfet/586/orig -> origin/gh/malfet/586/orig 2025-12-04T09:17:10.6693370Z * [new branch] gh/malfet/587/base -> origin/gh/malfet/587/base 2025-12-04T09:17:10.6695464Z * [new branch] gh/malfet/587/head -> origin/gh/malfet/587/head 2025-12-04T09:17:10.6697554Z * [new branch] gh/malfet/587/orig -> origin/gh/malfet/587/orig 2025-12-04T09:17:10.6700344Z * [new branch] gh/malfet/588/base -> origin/gh/malfet/588/base 2025-12-04T09:17:10.6702565Z * [new branch] gh/malfet/588/head -> origin/gh/malfet/588/head 2025-12-04T09:17:10.6705498Z * [new branch] gh/malfet/588/orig -> origin/gh/malfet/588/orig 2025-12-04T09:17:10.6708472Z * [new branch] gh/malfet/589/base -> origin/gh/malfet/589/base 2025-12-04T09:17:10.6710650Z * [new branch] gh/malfet/589/head -> origin/gh/malfet/589/head 2025-12-04T09:17:10.6712791Z * [new branch] gh/malfet/589/orig -> origin/gh/malfet/589/orig 2025-12-04T09:17:10.6715553Z * [new branch] gh/malfet/590/base -> origin/gh/malfet/590/base 2025-12-04T09:17:10.6717729Z * [new branch] gh/malfet/590/head -> origin/gh/malfet/590/head 2025-12-04T09:17:10.6719901Z * [new branch] gh/malfet/590/orig -> origin/gh/malfet/590/orig 2025-12-04T09:17:10.6723209Z * [new branch] gh/malfet/591/base -> origin/gh/malfet/591/base 2025-12-04T09:17:10.6725404Z * [new branch] gh/malfet/591/head -> origin/gh/malfet/591/head 2025-12-04T09:17:10.6727690Z * [new branch] gh/malfet/591/orig -> origin/gh/malfet/591/orig 2025-12-04T09:17:10.6730596Z * [new branch] gh/malfet/592/base -> origin/gh/malfet/592/base 2025-12-04T09:17:10.6732781Z * [new branch] gh/malfet/592/head -> origin/gh/malfet/592/head 2025-12-04T09:17:10.6734903Z * [new branch] gh/malfet/592/orig -> origin/gh/malfet/592/orig 2025-12-04T09:17:10.6737863Z * [new branch] gh/malfet/593/base -> origin/gh/malfet/593/base 2025-12-04T09:17:10.6739908Z * [new branch] gh/malfet/593/head -> origin/gh/malfet/593/head 2025-12-04T09:17:10.6742068Z * [new branch] gh/malfet/593/orig -> origin/gh/malfet/593/orig 2025-12-04T09:17:10.6745152Z * [new branch] gh/malfet/594/base -> origin/gh/malfet/594/base 2025-12-04T09:17:10.6747837Z * [new branch] gh/malfet/594/head -> origin/gh/malfet/594/head 2025-12-04T09:17:10.6749972Z * [new branch] gh/malfet/594/orig -> origin/gh/malfet/594/orig 2025-12-04T09:17:10.6752759Z * [new branch] gh/malfet/595/base -> origin/gh/malfet/595/base 2025-12-04T09:17:10.6754883Z * [new branch] gh/malfet/595/head -> origin/gh/malfet/595/head 2025-12-04T09:17:10.6757079Z * [new branch] gh/malfet/595/orig -> origin/gh/malfet/595/orig 2025-12-04T09:17:10.6759946Z * [new branch] gh/malfet/596/base -> origin/gh/malfet/596/base 2025-12-04T09:17:10.6762506Z * [new branch] gh/malfet/596/head -> origin/gh/malfet/596/head 2025-12-04T09:17:10.6764744Z * [new branch] gh/malfet/596/orig -> origin/gh/malfet/596/orig 2025-12-04T09:17:10.6767703Z * [new branch] gh/malfet/597/base -> origin/gh/malfet/597/base 2025-12-04T09:17:10.6769811Z * [new branch] gh/malfet/597/head -> origin/gh/malfet/597/head 2025-12-04T09:17:10.6771928Z * [new branch] gh/malfet/597/orig -> origin/gh/malfet/597/orig 2025-12-04T09:17:10.6774799Z * [new branch] gh/malfet/598/base -> origin/gh/malfet/598/base 2025-12-04T09:17:10.6777129Z * [new branch] gh/malfet/598/head -> origin/gh/malfet/598/head 2025-12-04T09:17:10.6779110Z * [new branch] gh/malfet/598/orig -> origin/gh/malfet/598/orig 2025-12-04T09:17:10.6782006Z * [new branch] gh/malfet/599/base -> origin/gh/malfet/599/base 2025-12-04T09:17:10.6784304Z * [new branch] gh/malfet/599/head -> origin/gh/malfet/599/head 2025-12-04T09:17:10.6786460Z * [new branch] gh/malfet/599/orig -> origin/gh/malfet/599/orig 2025-12-04T09:17:10.6789395Z * [new branch] gh/malfet/600/base -> origin/gh/malfet/600/base 2025-12-04T09:17:10.6791512Z * [new branch] gh/malfet/600/head -> origin/gh/malfet/600/head 2025-12-04T09:17:10.6793700Z * [new branch] gh/malfet/600/orig -> origin/gh/malfet/600/orig 2025-12-04T09:17:10.6796824Z * [new branch] gh/malfet/601/base -> origin/gh/malfet/601/base 2025-12-04T09:17:10.6798935Z * [new branch] gh/malfet/601/head -> origin/gh/malfet/601/head 2025-12-04T09:17:10.6801058Z * [new branch] gh/malfet/601/orig -> origin/gh/malfet/601/orig 2025-12-04T09:17:10.6804079Z * [new branch] gh/malfet/602/base -> origin/gh/malfet/602/base 2025-12-04T09:17:10.6806132Z * [new branch] gh/malfet/602/head -> origin/gh/malfet/602/head 2025-12-04T09:17:10.6808299Z * [new branch] gh/malfet/602/orig -> origin/gh/malfet/602/orig 2025-12-04T09:17:10.6811104Z * [new branch] gh/malfet/603/base -> origin/gh/malfet/603/base 2025-12-04T09:17:10.6813163Z * [new branch] gh/malfet/603/head -> origin/gh/malfet/603/head 2025-12-04T09:17:10.6815258Z * [new branch] gh/malfet/603/orig -> origin/gh/malfet/603/orig 2025-12-04T09:17:10.6818097Z * [new branch] gh/malfet/604/base -> origin/gh/malfet/604/base 2025-12-04T09:17:10.6820258Z * [new branch] gh/malfet/604/head -> origin/gh/malfet/604/head 2025-12-04T09:17:10.6822403Z * [new branch] gh/malfet/604/orig -> origin/gh/malfet/604/orig 2025-12-04T09:17:10.6828047Z * [new branch] gh/malfet/605/base -> origin/gh/malfet/605/base 2025-12-04T09:17:10.6830117Z * [new branch] gh/malfet/605/head -> origin/gh/malfet/605/head 2025-12-04T09:17:10.6832294Z * [new branch] gh/malfet/605/orig -> origin/gh/malfet/605/orig 2025-12-04T09:17:10.6835220Z * [new branch] gh/malfet/606/base -> origin/gh/malfet/606/base 2025-12-04T09:17:10.6837331Z * [new branch] gh/malfet/606/head -> origin/gh/malfet/606/head 2025-12-04T09:17:10.6839500Z * [new branch] gh/malfet/606/orig -> origin/gh/malfet/606/orig 2025-12-04T09:17:10.6842340Z * [new branch] gh/malfet/607/base -> origin/gh/malfet/607/base 2025-12-04T09:17:10.6844385Z * [new branch] gh/malfet/607/head -> origin/gh/malfet/607/head 2025-12-04T09:17:10.6846654Z * [new branch] gh/malfet/607/orig -> origin/gh/malfet/607/orig 2025-12-04T09:17:10.6849511Z * [new branch] gh/malfet/608/base -> origin/gh/malfet/608/base 2025-12-04T09:17:10.6851625Z * [new branch] gh/malfet/608/head -> origin/gh/malfet/608/head 2025-12-04T09:17:10.6853794Z * [new branch] gh/malfet/608/orig -> origin/gh/malfet/608/orig 2025-12-04T09:17:10.6857346Z * [new branch] gh/malfet/609/base -> origin/gh/malfet/609/base 2025-12-04T09:17:10.6859443Z * [new branch] gh/malfet/609/head -> origin/gh/malfet/609/head 2025-12-04T09:17:10.6861565Z * [new branch] gh/malfet/609/orig -> origin/gh/malfet/609/orig 2025-12-04T09:17:10.6864805Z * [new branch] gh/malfet/610/base -> origin/gh/malfet/610/base 2025-12-04T09:17:10.6866865Z * [new branch] gh/malfet/610/head -> origin/gh/malfet/610/head 2025-12-04T09:17:10.6869038Z * [new branch] gh/malfet/610/orig -> origin/gh/malfet/610/orig 2025-12-04T09:17:10.6871980Z * [new branch] gh/malfet/611/base -> origin/gh/malfet/611/base 2025-12-04T09:17:10.6874137Z * [new branch] gh/malfet/611/head -> origin/gh/malfet/611/head 2025-12-04T09:17:10.6876272Z * [new branch] gh/malfet/611/orig -> origin/gh/malfet/611/orig 2025-12-04T09:17:10.6879039Z * [new branch] gh/malfet/612/base -> origin/gh/malfet/612/base 2025-12-04T09:17:10.6881155Z * [new branch] gh/malfet/612/head -> origin/gh/malfet/612/head 2025-12-04T09:17:10.6883274Z * [new branch] gh/malfet/612/orig -> origin/gh/malfet/612/orig 2025-12-04T09:17:10.6886377Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-12-04T09:17:10.6888460Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-12-04T09:17:10.6891873Z * [new branch] gh/manuelcandales/11/base -> origin/gh/manuelcandales/11/base 2025-12-04T09:17:10.6894025Z * [new branch] gh/manuelcandales/11/head -> origin/gh/manuelcandales/11/head 2025-12-04T09:17:10.6896127Z * [new branch] gh/manuelcandales/11/orig -> origin/gh/manuelcandales/11/orig 2025-12-04T09:17:10.6899620Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-12-04T09:17:10.6903238Z * [new branch] gh/masnesral/1/base -> origin/gh/masnesral/1/base 2025-12-04T09:17:10.6905422Z * [new branch] gh/masnesral/1/head -> origin/gh/masnesral/1/head 2025-12-04T09:17:10.6907617Z * [new branch] gh/masnesral/1/orig -> origin/gh/masnesral/1/orig 2025-12-04T09:17:10.6911611Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-12-04T09:17:10.6913732Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-12-04T09:17:10.6916441Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-12-04T09:17:10.6918606Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 2025-12-04T09:17:10.6921340Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-12-04T09:17:10.6923428Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-12-04T09:17:10.6926399Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-12-04T09:17:10.6928431Z * [new branch] gh/mhorowitz/3/head -> origin/gh/mhorowitz/3/head 2025-12-04T09:17:10.6931117Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 2025-12-04T09:17:10.6933261Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-12-04T09:17:10.6935980Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-12-04T09:17:10.6938005Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-12-04T09:17:10.6940673Z * [new branch] gh/mhorowitz/6/base -> origin/gh/mhorowitz/6/base 2025-12-04T09:17:10.6942825Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-12-04T09:17:10.6946314Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-12-04T09:17:10.6948420Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-12-04T09:17:10.6951177Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 2025-12-04T09:17:10.6953387Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-12-04T09:17:10.6956048Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-12-04T09:17:10.6958407Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-12-04T09:17:10.6961275Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-12-04T09:17:10.6963310Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-12-04T09:17:10.6966219Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-12-04T09:17:10.6968395Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-12-04T09:17:10.6971236Z * [new branch] gh/mikaylagawarecki/336/base -> origin/gh/mikaylagawarecki/336/base 2025-12-04T09:17:10.6973359Z * [new branch] gh/mikaylagawarecki/336/head -> origin/gh/mikaylagawarecki/336/head 2025-12-04T09:17:10.6975500Z * [new branch] gh/mikaylagawarecki/336/orig -> origin/gh/mikaylagawarecki/336/orig 2025-12-04T09:17:10.6978450Z * [new branch] gh/mikaylagawarecki/341/base -> origin/gh/mikaylagawarecki/341/base 2025-12-04T09:17:10.6980519Z * [new branch] gh/mikaylagawarecki/341/head -> origin/gh/mikaylagawarecki/341/head 2025-12-04T09:17:10.6982793Z * [new branch] gh/mikaylagawarecki/341/orig -> origin/gh/mikaylagawarecki/341/orig 2025-12-04T09:17:10.6985893Z * [new branch] gh/mikaylagawarecki/342/base -> origin/gh/mikaylagawarecki/342/base 2025-12-04T09:17:10.6988106Z * [new branch] gh/mikaylagawarecki/342/head -> origin/gh/mikaylagawarecki/342/head 2025-12-04T09:17:10.6990354Z * [new branch] gh/mikaylagawarecki/342/orig -> origin/gh/mikaylagawarecki/342/orig 2025-12-04T09:17:10.6993318Z * [new branch] gh/mikaylagawarecki/345/base -> origin/gh/mikaylagawarecki/345/base 2025-12-04T09:17:10.6995331Z * [new branch] gh/mikaylagawarecki/345/head -> origin/gh/mikaylagawarecki/345/head 2025-12-04T09:17:10.6997559Z * [new branch] gh/mikaylagawarecki/345/orig -> origin/gh/mikaylagawarecki/345/orig 2025-12-04T09:17:10.7000577Z * [new branch] gh/mikaylagawarecki/346/base -> origin/gh/mikaylagawarecki/346/base 2025-12-04T09:17:10.7002681Z * [new branch] gh/mikaylagawarecki/346/head -> origin/gh/mikaylagawarecki/346/head 2025-12-04T09:17:10.7004910Z * [new branch] gh/mikaylagawarecki/346/orig -> origin/gh/mikaylagawarecki/346/orig 2025-12-04T09:17:10.7007887Z * [new branch] gh/mikaylagawarecki/347/base -> origin/gh/mikaylagawarecki/347/base 2025-12-04T09:17:10.7010018Z * [new branch] gh/mikaylagawarecki/347/head -> origin/gh/mikaylagawarecki/347/head 2025-12-04T09:17:10.7012091Z * [new branch] gh/mikaylagawarecki/347/orig -> origin/gh/mikaylagawarecki/347/orig 2025-12-04T09:17:10.7015021Z * [new branch] gh/mikaylagawarecki/350/base -> origin/gh/mikaylagawarecki/350/base 2025-12-04T09:17:10.7017208Z * [new branch] gh/mikaylagawarecki/350/head -> origin/gh/mikaylagawarecki/350/head 2025-12-04T09:17:10.7019826Z * [new branch] gh/mikaylagawarecki/350/orig -> origin/gh/mikaylagawarecki/350/orig 2025-12-04T09:17:10.7023231Z * [new branch] gh/mikaylagawarecki/351/base -> origin/gh/mikaylagawarecki/351/base 2025-12-04T09:17:10.7025401Z * [new branch] gh/mikaylagawarecki/351/head -> origin/gh/mikaylagawarecki/351/head 2025-12-04T09:17:10.7029224Z * [new branch] gh/mikaylagawarecki/351/orig -> origin/gh/mikaylagawarecki/351/orig 2025-12-04T09:17:10.7032251Z * [new branch] gh/mikaylagawarecki/352/base -> origin/gh/mikaylagawarecki/352/base 2025-12-04T09:17:10.7034469Z * [new branch] gh/mikaylagawarecki/352/head -> origin/gh/mikaylagawarecki/352/head 2025-12-04T09:17:10.7036736Z * [new branch] gh/mikaylagawarecki/352/orig -> origin/gh/mikaylagawarecki/352/orig 2025-12-04T09:17:10.7039656Z * [new branch] gh/mikaylagawarecki/353/base -> origin/gh/mikaylagawarecki/353/base 2025-12-04T09:17:10.7042041Z * [new branch] gh/mikaylagawarecki/353/head -> origin/gh/mikaylagawarecki/353/head 2025-12-04T09:17:10.7044198Z * [new branch] gh/mikaylagawarecki/353/orig -> origin/gh/mikaylagawarecki/353/orig 2025-12-04T09:17:10.7046914Z * [new branch] gh/mikaylagawarecki/354/base -> origin/gh/mikaylagawarecki/354/base 2025-12-04T09:17:10.7049151Z * [new branch] gh/mikaylagawarecki/354/head -> origin/gh/mikaylagawarecki/354/head 2025-12-04T09:17:10.7051304Z * [new branch] gh/mikaylagawarecki/354/orig -> origin/gh/mikaylagawarecki/354/orig 2025-12-04T09:17:10.7054683Z * [new branch] gh/mikaylagawarecki/356/base -> origin/gh/mikaylagawarecki/356/base 2025-12-04T09:17:10.7056819Z * [new branch] gh/mikaylagawarecki/356/head -> origin/gh/mikaylagawarecki/356/head 2025-12-04T09:17:10.7058917Z * [new branch] gh/mikaylagawarecki/356/orig -> origin/gh/mikaylagawarecki/356/orig 2025-12-04T09:17:10.7061687Z * [new branch] gh/mikaylagawarecki/357/base -> origin/gh/mikaylagawarecki/357/base 2025-12-04T09:17:10.7063983Z * [new branch] gh/mikaylagawarecki/357/head -> origin/gh/mikaylagawarecki/357/head 2025-12-04T09:17:10.7066107Z * [new branch] gh/mikaylagawarecki/357/orig -> origin/gh/mikaylagawarecki/357/orig 2025-12-04T09:17:10.7069162Z * [new branch] gh/mikaylagawarecki/359/base -> origin/gh/mikaylagawarecki/359/base 2025-12-04T09:17:10.7071341Z * [new branch] gh/mikaylagawarecki/359/head -> origin/gh/mikaylagawarecki/359/head 2025-12-04T09:17:10.7073444Z * [new branch] gh/mikaylagawarecki/359/orig -> origin/gh/mikaylagawarecki/359/orig 2025-12-04T09:17:10.7076378Z * [new branch] gh/mikaylagawarecki/360/base -> origin/gh/mikaylagawarecki/360/base 2025-12-04T09:17:10.7078661Z * [new branch] gh/mikaylagawarecki/360/head -> origin/gh/mikaylagawarecki/360/head 2025-12-04T09:17:10.7080761Z * [new branch] gh/mikaylagawarecki/360/orig -> origin/gh/mikaylagawarecki/360/orig 2025-12-04T09:17:10.7083698Z * [new branch] gh/mikaylagawarecki/361/base -> origin/gh/mikaylagawarecki/361/base 2025-12-04T09:17:10.7085817Z * [new branch] gh/mikaylagawarecki/361/head -> origin/gh/mikaylagawarecki/361/head 2025-12-04T09:17:10.7087998Z * [new branch] gh/mikaylagawarecki/361/orig -> origin/gh/mikaylagawarecki/361/orig 2025-12-04T09:17:10.7090963Z * [new branch] gh/mikaylagawarecki/362/base -> origin/gh/mikaylagawarecki/362/base 2025-12-04T09:17:10.7093208Z * [new branch] gh/mikaylagawarecki/362/head -> origin/gh/mikaylagawarecki/362/head 2025-12-04T09:17:10.7095341Z * [new branch] gh/mikaylagawarecki/362/orig -> origin/gh/mikaylagawarecki/362/orig 2025-12-04T09:17:10.7098544Z * [new branch] gh/mikaylagawarecki/363/base -> origin/gh/mikaylagawarecki/363/base 2025-12-04T09:17:10.7100876Z * [new branch] gh/mikaylagawarecki/363/head -> origin/gh/mikaylagawarecki/363/head 2025-12-04T09:17:10.7103064Z * [new branch] gh/mikaylagawarecki/363/orig -> origin/gh/mikaylagawarecki/363/orig 2025-12-04T09:17:10.7106447Z * [new branch] gh/mikaylagawarecki/364/base -> origin/gh/mikaylagawarecki/364/base 2025-12-04T09:17:10.7108661Z * [new branch] gh/mikaylagawarecki/364/head -> origin/gh/mikaylagawarecki/364/head 2025-12-04T09:17:10.7110907Z * [new branch] gh/mikaylagawarecki/364/orig -> origin/gh/mikaylagawarecki/364/orig 2025-12-04T09:17:10.7114010Z * [new branch] gh/mikaylagawarecki/365/base -> origin/gh/mikaylagawarecki/365/base 2025-12-04T09:17:10.7116333Z * [new branch] gh/mikaylagawarecki/365/head -> origin/gh/mikaylagawarecki/365/head 2025-12-04T09:17:10.7118540Z * [new branch] gh/mikaylagawarecki/365/orig -> origin/gh/mikaylagawarecki/365/orig 2025-12-04T09:17:10.7121517Z * [new branch] gh/mikaylagawarecki/366/base -> origin/gh/mikaylagawarecki/366/base 2025-12-04T09:17:10.7123586Z * [new branch] gh/mikaylagawarecki/366/head -> origin/gh/mikaylagawarecki/366/head 2025-12-04T09:17:10.7125998Z * [new branch] gh/mikaylagawarecki/366/orig -> origin/gh/mikaylagawarecki/366/orig 2025-12-04T09:17:10.7128939Z * [new branch] gh/mikaylagawarecki/367/base -> origin/gh/mikaylagawarecki/367/base 2025-12-04T09:17:10.7131087Z * [new branch] gh/mikaylagawarecki/367/head -> origin/gh/mikaylagawarecki/367/head 2025-12-04T09:17:10.7133275Z * [new branch] gh/mikaylagawarecki/367/orig -> origin/gh/mikaylagawarecki/367/orig 2025-12-04T09:17:10.7136237Z * [new branch] gh/mikaylagawarecki/368/base -> origin/gh/mikaylagawarecki/368/base 2025-12-04T09:17:10.7138520Z * [new branch] gh/mikaylagawarecki/368/head -> origin/gh/mikaylagawarecki/368/head 2025-12-04T09:17:10.7141727Z * [new branch] gh/mikaylagawarecki/368/orig -> origin/gh/mikaylagawarecki/368/orig 2025-12-04T09:17:10.7144685Z * [new branch] gh/mikaylagawarecki/369/base -> origin/gh/mikaylagawarecki/369/base 2025-12-04T09:17:10.7147791Z * [new branch] gh/mikaylagawarecki/369/head -> origin/gh/mikaylagawarecki/369/head 2025-12-04T09:17:10.7149890Z * [new branch] gh/mikaylagawarecki/369/orig -> origin/gh/mikaylagawarecki/369/orig 2025-12-04T09:17:10.7153033Z * [new branch] gh/mikaylagawarecki/370/base -> origin/gh/mikaylagawarecki/370/base 2025-12-04T09:17:10.7155126Z * [new branch] gh/mikaylagawarecki/370/head -> origin/gh/mikaylagawarecki/370/head 2025-12-04T09:17:10.7157264Z * [new branch] gh/mikaylagawarecki/370/orig -> origin/gh/mikaylagawarecki/370/orig 2025-12-04T09:17:10.7160602Z * [new branch] gh/mikaylagawarecki/371/base -> origin/gh/mikaylagawarecki/371/base 2025-12-04T09:17:10.7162699Z * [new branch] gh/mikaylagawarecki/371/head -> origin/gh/mikaylagawarecki/371/head 2025-12-04T09:17:10.7164835Z * [new branch] gh/mikaylagawarecki/371/orig -> origin/gh/mikaylagawarecki/371/orig 2025-12-04T09:17:10.7167867Z * [new branch] gh/mikaylagawarecki/372/base -> origin/gh/mikaylagawarecki/372/base 2025-12-04T09:17:10.7170195Z * [new branch] gh/mikaylagawarecki/372/head -> origin/gh/mikaylagawarecki/372/head 2025-12-04T09:17:10.7172348Z * [new branch] gh/mikaylagawarecki/372/orig -> origin/gh/mikaylagawarecki/372/orig 2025-12-04T09:17:10.7175186Z * [new branch] gh/mikaylagawarecki/373/base -> origin/gh/mikaylagawarecki/373/base 2025-12-04T09:17:10.7177367Z * [new branch] gh/mikaylagawarecki/373/head -> origin/gh/mikaylagawarecki/373/head 2025-12-04T09:17:10.7179968Z * [new branch] gh/mikaylagawarecki/373/orig -> origin/gh/mikaylagawarecki/373/orig 2025-12-04T09:17:10.7182982Z * [new branch] gh/mikaylagawarecki/374/base -> origin/gh/mikaylagawarecki/374/base 2025-12-04T09:17:10.7185201Z * [new branch] gh/mikaylagawarecki/374/head -> origin/gh/mikaylagawarecki/374/head 2025-12-04T09:17:10.7187453Z * [new branch] gh/mikaylagawarecki/374/orig -> origin/gh/mikaylagawarecki/374/orig 2025-12-04T09:17:10.7190388Z * [new branch] gh/mikaylagawarecki/375/base -> origin/gh/mikaylagawarecki/375/base 2025-12-04T09:17:10.7192606Z * [new branch] gh/mikaylagawarecki/375/head -> origin/gh/mikaylagawarecki/375/head 2025-12-04T09:17:10.7194766Z * [new branch] gh/mikaylagawarecki/375/orig -> origin/gh/mikaylagawarecki/375/orig 2025-12-04T09:17:10.7197784Z * [new branch] gh/mikaylagawarecki/376/base -> origin/gh/mikaylagawarecki/376/base 2025-12-04T09:17:10.7200261Z * [new branch] gh/mikaylagawarecki/376/head -> origin/gh/mikaylagawarecki/376/head 2025-12-04T09:17:10.7202273Z * [new branch] gh/mikaylagawarecki/376/orig -> origin/gh/mikaylagawarecki/376/orig 2025-12-04T09:17:10.7205362Z * [new branch] gh/mikaylagawarecki/377/base -> origin/gh/mikaylagawarecki/377/base 2025-12-04T09:17:10.7207587Z * [new branch] gh/mikaylagawarecki/377/head -> origin/gh/mikaylagawarecki/377/head 2025-12-04T09:17:10.7209763Z * [new branch] gh/mikaylagawarecki/377/orig -> origin/gh/mikaylagawarecki/377/orig 2025-12-04T09:17:10.7212726Z * [new branch] gh/mikaylagawarecki/378/base -> origin/gh/mikaylagawarecki/378/base 2025-12-04T09:17:10.7215449Z * [new branch] gh/mikaylagawarecki/378/head -> origin/gh/mikaylagawarecki/378/head 2025-12-04T09:17:10.7217639Z * [new branch] gh/mikaylagawarecki/378/orig -> origin/gh/mikaylagawarecki/378/orig 2025-12-04T09:17:10.7220617Z * [new branch] gh/mikaylagawarecki/379/base -> origin/gh/mikaylagawarecki/379/base 2025-12-04T09:17:10.7222847Z * [new branch] gh/mikaylagawarecki/379/head -> origin/gh/mikaylagawarecki/379/head 2025-12-04T09:17:10.7225126Z * [new branch] gh/mikaylagawarecki/379/orig -> origin/gh/mikaylagawarecki/379/orig 2025-12-04T09:17:10.7228049Z * [new branch] gh/mikaylagawarecki/380/base -> origin/gh/mikaylagawarecki/380/base 2025-12-04T09:17:10.7230262Z * [new branch] gh/mikaylagawarecki/380/head -> origin/gh/mikaylagawarecki/380/head 2025-12-04T09:17:10.7232458Z * [new branch] gh/mikaylagawarecki/380/orig -> origin/gh/mikaylagawarecki/380/orig 2025-12-04T09:17:10.7235234Z * [new branch] gh/mikaylagawarecki/381/base -> origin/gh/mikaylagawarecki/381/base 2025-12-04T09:17:10.7237381Z * [new branch] gh/mikaylagawarecki/381/head -> origin/gh/mikaylagawarecki/381/head 2025-12-04T09:17:10.7239541Z * [new branch] gh/mikaylagawarecki/381/orig -> origin/gh/mikaylagawarecki/381/orig 2025-12-04T09:17:10.7242296Z * [new branch] gh/mikaylagawarecki/382/base -> origin/gh/mikaylagawarecki/382/base 2025-12-04T09:17:10.7244506Z * [new branch] gh/mikaylagawarecki/382/head -> origin/gh/mikaylagawarecki/382/head 2025-12-04T09:17:10.7246607Z * [new branch] gh/mikaylagawarecki/382/orig -> origin/gh/mikaylagawarecki/382/orig 2025-12-04T09:17:10.7249637Z * [new branch] gh/mikaylagawarecki/383/base -> origin/gh/mikaylagawarecki/383/base 2025-12-04T09:17:10.7251886Z * [new branch] gh/mikaylagawarecki/383/head -> origin/gh/mikaylagawarecki/383/head 2025-12-04T09:17:10.7254140Z * [new branch] gh/mikaylagawarecki/383/orig -> origin/gh/mikaylagawarecki/383/orig 2025-12-04T09:17:10.7257050Z * [new branch] gh/mikaylagawarecki/384/base -> origin/gh/mikaylagawarecki/384/base 2025-12-04T09:17:10.7259363Z * [new branch] gh/mikaylagawarecki/384/head -> origin/gh/mikaylagawarecki/384/head 2025-12-04T09:17:10.7261504Z * [new branch] gh/mikaylagawarecki/384/orig -> origin/gh/mikaylagawarecki/384/orig 2025-12-04T09:17:10.7264674Z * [new branch] gh/mikaylagawarecki/385/base -> origin/gh/mikaylagawarecki/385/base 2025-12-04T09:17:10.7266865Z * [new branch] gh/mikaylagawarecki/385/head -> origin/gh/mikaylagawarecki/385/head 2025-12-04T09:17:10.7268982Z * [new branch] gh/mikaylagawarecki/385/orig -> origin/gh/mikaylagawarecki/385/orig 2025-12-04T09:17:10.7272001Z * [new branch] gh/mikaylagawarecki/386/base -> origin/gh/mikaylagawarecki/386/base 2025-12-04T09:17:10.7274108Z * [new branch] gh/mikaylagawarecki/386/head -> origin/gh/mikaylagawarecki/386/head 2025-12-04T09:17:10.7276284Z * [new branch] gh/mikaylagawarecki/386/orig -> origin/gh/mikaylagawarecki/386/orig 2025-12-04T09:17:10.7279356Z * [new branch] gh/mikaylagawarecki/387/base -> origin/gh/mikaylagawarecki/387/base 2025-12-04T09:17:10.7281395Z * [new branch] gh/mikaylagawarecki/387/head -> origin/gh/mikaylagawarecki/387/head 2025-12-04T09:17:10.7283483Z * [new branch] gh/mikaylagawarecki/387/orig -> origin/gh/mikaylagawarecki/387/orig 2025-12-04T09:17:10.7286440Z * [new branch] gh/mikaylagawarecki/388/base -> origin/gh/mikaylagawarecki/388/base 2025-12-04T09:17:10.7288727Z * [new branch] gh/mikaylagawarecki/388/head -> origin/gh/mikaylagawarecki/388/head 2025-12-04T09:17:10.7290919Z * [new branch] gh/mikaylagawarecki/388/orig -> origin/gh/mikaylagawarecki/388/orig 2025-12-04T09:17:10.7293925Z * [new branch] gh/mikaylagawarecki/389/base -> origin/gh/mikaylagawarecki/389/base 2025-12-04T09:17:10.7296087Z * [new branch] gh/mikaylagawarecki/389/head -> origin/gh/mikaylagawarecki/389/head 2025-12-04T09:17:10.7298273Z * [new branch] gh/mikaylagawarecki/389/orig -> origin/gh/mikaylagawarecki/389/orig 2025-12-04T09:17:10.7301252Z * [new branch] gh/mikaylagawarecki/390/base -> origin/gh/mikaylagawarecki/390/base 2025-12-04T09:17:10.7303437Z * [new branch] gh/mikaylagawarecki/390/head -> origin/gh/mikaylagawarecki/390/head 2025-12-04T09:17:10.7305579Z * [new branch] gh/mikaylagawarecki/390/orig -> origin/gh/mikaylagawarecki/390/orig 2025-12-04T09:17:10.7308664Z * [new branch] gh/mikaylagawarecki/391/base -> origin/gh/mikaylagawarecki/391/base 2025-12-04T09:17:10.7310886Z * [new branch] gh/mikaylagawarecki/391/head -> origin/gh/mikaylagawarecki/391/head 2025-12-04T09:17:10.7313095Z * [new branch] gh/mikaylagawarecki/391/orig -> origin/gh/mikaylagawarecki/391/orig 2025-12-04T09:17:10.7316093Z * [new branch] gh/mikaylagawarecki/392/base -> origin/gh/mikaylagawarecki/392/base 2025-12-04T09:17:10.7318393Z * [new branch] gh/mikaylagawarecki/392/head -> origin/gh/mikaylagawarecki/392/head 2025-12-04T09:17:10.7320521Z * [new branch] gh/mikaylagawarecki/392/orig -> origin/gh/mikaylagawarecki/392/orig 2025-12-04T09:17:10.7323889Z * [new branch] gh/mlazos/41/base -> origin/gh/mlazos/41/base 2025-12-04T09:17:10.7326309Z * [new branch] gh/mlazos/41/head -> origin/gh/mlazos/41/head 2025-12-04T09:17:10.7328472Z * [new branch] gh/mlazos/41/orig -> origin/gh/mlazos/41/orig 2025-12-04T09:17:10.7331433Z * [new branch] gh/mlazos/42/base -> origin/gh/mlazos/42/base 2025-12-04T09:17:10.7333518Z * [new branch] gh/mlazos/42/head -> origin/gh/mlazos/42/head 2025-12-04T09:17:10.7335637Z * [new branch] gh/mlazos/42/orig -> origin/gh/mlazos/42/orig 2025-12-04T09:17:10.7338372Z * [new branch] gh/mlazos/43/base -> origin/gh/mlazos/43/base 2025-12-04T09:17:10.7340991Z * [new branch] gh/mlazos/43/head -> origin/gh/mlazos/43/head 2025-12-04T09:17:10.7343200Z * [new branch] gh/mlazos/43/orig -> origin/gh/mlazos/43/orig 2025-12-04T09:17:10.7346024Z * [new branch] gh/mlazos/44/base -> origin/gh/mlazos/44/base 2025-12-04T09:17:10.7348194Z * [new branch] gh/mlazos/44/head -> origin/gh/mlazos/44/head 2025-12-04T09:17:10.7350376Z * [new branch] gh/mlazos/44/orig -> origin/gh/mlazos/44/orig 2025-12-04T09:17:10.7353175Z * [new branch] gh/mlazos/47/base -> origin/gh/mlazos/47/base 2025-12-04T09:17:10.7355291Z * [new branch] gh/mlazos/47/head -> origin/gh/mlazos/47/head 2025-12-04T09:17:10.7357380Z * [new branch] gh/mlazos/47/orig -> origin/gh/mlazos/47/orig 2025-12-04T09:17:10.7360803Z * [new branch] gh/mlazos/48/base -> origin/gh/mlazos/48/base 2025-12-04T09:17:10.7369757Z * [new branch] gh/mlazos/48/head -> origin/gh/mlazos/48/head 2025-12-04T09:17:10.7370312Z * [new branch] gh/mlazos/48/orig -> origin/gh/mlazos/48/orig 2025-12-04T09:17:10.7370833Z * [new branch] gh/mlazos/49/base -> origin/gh/mlazos/49/base 2025-12-04T09:17:10.7371502Z * [new branch] gh/mlazos/49/head -> origin/gh/mlazos/49/head 2025-12-04T09:17:10.7372029Z * [new branch] gh/mlazos/49/orig -> origin/gh/mlazos/49/orig 2025-12-04T09:17:10.7374294Z * [new branch] gh/mlazos/50/base -> origin/gh/mlazos/50/base 2025-12-04T09:17:10.7376287Z * [new branch] gh/mlazos/50/head -> origin/gh/mlazos/50/head 2025-12-04T09:17:10.7378464Z * [new branch] gh/mlazos/50/orig -> origin/gh/mlazos/50/orig 2025-12-04T09:17:10.7381653Z * [new branch] gh/mlazos/51/base -> origin/gh/mlazos/51/base 2025-12-04T09:17:10.7383910Z * [new branch] gh/mlazos/51/head -> origin/gh/mlazos/51/head 2025-12-04T09:17:10.7385995Z * [new branch] gh/mlazos/51/orig -> origin/gh/mlazos/51/orig 2025-12-04T09:17:10.7388758Z * [new branch] gh/mlazos/52/base -> origin/gh/mlazos/52/base 2025-12-04T09:17:10.7390854Z * [new branch] gh/mlazos/52/head -> origin/gh/mlazos/52/head 2025-12-04T09:17:10.7392943Z * [new branch] gh/mlazos/52/orig -> origin/gh/mlazos/52/orig 2025-12-04T09:17:10.7395729Z * [new branch] gh/mlazos/53/base -> origin/gh/mlazos/53/base 2025-12-04T09:17:10.7397834Z * [new branch] gh/mlazos/53/head -> origin/gh/mlazos/53/head 2025-12-04T09:17:10.7400044Z * [new branch] gh/mlazos/53/orig -> origin/gh/mlazos/53/orig 2025-12-04T09:17:10.7402873Z * [new branch] gh/mlazos/54/base -> origin/gh/mlazos/54/base 2025-12-04T09:17:10.7404917Z * [new branch] gh/mlazos/54/head -> origin/gh/mlazos/54/head 2025-12-04T09:17:10.7407079Z * [new branch] gh/mlazos/54/orig -> origin/gh/mlazos/54/orig 2025-12-04T09:17:10.7409807Z * [new branch] gh/mlazos/55/base -> origin/gh/mlazos/55/base 2025-12-04T09:17:10.7411917Z * [new branch] gh/mlazos/55/head -> origin/gh/mlazos/55/head 2025-12-04T09:17:10.7414014Z * [new branch] gh/mlazos/55/orig -> origin/gh/mlazos/55/orig 2025-12-04T09:17:10.7416861Z * [new branch] gh/mlazos/56/base -> origin/gh/mlazos/56/base 2025-12-04T09:17:10.7418961Z * [new branch] gh/mlazos/56/head -> origin/gh/mlazos/56/head 2025-12-04T09:17:10.7421095Z * [new branch] gh/mlazos/56/orig -> origin/gh/mlazos/56/orig 2025-12-04T09:17:10.7424029Z * [new branch] gh/mlazos/57/base -> origin/gh/mlazos/57/base 2025-12-04T09:17:10.7428282Z * [new branch] gh/mlazos/57/head -> origin/gh/mlazos/57/head 2025-12-04T09:17:10.7430305Z * [new branch] gh/mlazos/57/orig -> origin/gh/mlazos/57/orig 2025-12-04T09:17:10.7433305Z * [new branch] gh/mlazos/58/base -> origin/gh/mlazos/58/base 2025-12-04T09:17:10.7435356Z * [new branch] gh/mlazos/58/head -> origin/gh/mlazos/58/head 2025-12-04T09:17:10.7437530Z * [new branch] gh/mlazos/58/orig -> origin/gh/mlazos/58/orig 2025-12-04T09:17:10.7440286Z * [new branch] gh/mlazos/59/base -> origin/gh/mlazos/59/base 2025-12-04T09:17:10.7442370Z * [new branch] gh/mlazos/59/head -> origin/gh/mlazos/59/head 2025-12-04T09:17:10.7444398Z * [new branch] gh/mlazos/59/orig -> origin/gh/mlazos/59/orig 2025-12-04T09:17:10.7447257Z * [new branch] gh/mlazos/60/base -> origin/gh/mlazos/60/base 2025-12-04T09:17:10.7449529Z * [new branch] gh/mlazos/60/head -> origin/gh/mlazos/60/head 2025-12-04T09:17:10.7451467Z * [new branch] gh/mlazos/60/orig -> origin/gh/mlazos/60/orig 2025-12-04T09:17:10.7454763Z * [new branch] gh/mlazos/61/base -> origin/gh/mlazos/61/base 2025-12-04T09:17:10.7456901Z * [new branch] gh/mlazos/61/head -> origin/gh/mlazos/61/head 2025-12-04T09:17:10.7459471Z * [new branch] gh/mlazos/61/orig -> origin/gh/mlazos/61/orig 2025-12-04T09:17:10.7462447Z * [new branch] gh/mlazos/62/base -> origin/gh/mlazos/62/base 2025-12-04T09:17:10.7464741Z * [new branch] gh/mlazos/62/head -> origin/gh/mlazos/62/head 2025-12-04T09:17:10.7466894Z * [new branch] gh/mlazos/62/orig -> origin/gh/mlazos/62/orig 2025-12-04T09:17:10.7469967Z * [new branch] gh/mlazos/63/base -> origin/gh/mlazos/63/base 2025-12-04T09:17:10.7472106Z * [new branch] gh/mlazos/63/head -> origin/gh/mlazos/63/head 2025-12-04T09:17:10.7474297Z * [new branch] gh/mlazos/63/orig -> origin/gh/mlazos/63/orig 2025-12-04T09:17:10.7477567Z * [new branch] gh/mlazos/64/base -> origin/gh/mlazos/64/base 2025-12-04T09:17:10.7479633Z * [new branch] gh/mlazos/64/head -> origin/gh/mlazos/64/head 2025-12-04T09:17:10.7481755Z * [new branch] gh/mlazos/64/orig -> origin/gh/mlazos/64/orig 2025-12-04T09:17:10.7484628Z * [new branch] gh/mlazos/65/base -> origin/gh/mlazos/65/base 2025-12-04T09:17:10.7486726Z * [new branch] gh/mlazos/65/head -> origin/gh/mlazos/65/head 2025-12-04T09:17:10.7488843Z * [new branch] gh/mlazos/65/orig -> origin/gh/mlazos/65/orig 2025-12-04T09:17:10.7491822Z * [new branch] gh/mlazos/66/base -> origin/gh/mlazos/66/base 2025-12-04T09:17:10.7493971Z * [new branch] gh/mlazos/66/head -> origin/gh/mlazos/66/head 2025-12-04T09:17:10.7496052Z * [new branch] gh/mlazos/66/orig -> origin/gh/mlazos/66/orig 2025-12-04T09:17:10.7498931Z * [new branch] gh/mlazos/67/base -> origin/gh/mlazos/67/base 2025-12-04T09:17:10.7501074Z * [new branch] gh/mlazos/67/head -> origin/gh/mlazos/67/head 2025-12-04T09:17:10.7503283Z * [new branch] gh/mlazos/67/orig -> origin/gh/mlazos/67/orig 2025-12-04T09:17:10.7506217Z * [new branch] gh/mlazos/68/base -> origin/gh/mlazos/68/base 2025-12-04T09:17:10.7508339Z * [new branch] gh/mlazos/68/head -> origin/gh/mlazos/68/head 2025-12-04T09:17:10.7510434Z * [new branch] gh/mlazos/68/orig -> origin/gh/mlazos/68/orig 2025-12-04T09:17:10.7513392Z * [new branch] gh/mlazos/69/base -> origin/gh/mlazos/69/base 2025-12-04T09:17:10.7515510Z * [new branch] gh/mlazos/69/head -> origin/gh/mlazos/69/head 2025-12-04T09:17:10.7517597Z * [new branch] gh/mlazos/69/orig -> origin/gh/mlazos/69/orig 2025-12-04T09:17:10.7520560Z * [new branch] gh/mlazos/70/base -> origin/gh/mlazos/70/base 2025-12-04T09:17:10.7522657Z * [new branch] gh/mlazos/70/head -> origin/gh/mlazos/70/head 2025-12-04T09:17:10.7525166Z * [new branch] gh/mlazos/70/orig -> origin/gh/mlazos/70/orig 2025-12-04T09:17:10.7527969Z * [new branch] gh/mlazos/71/base -> origin/gh/mlazos/71/base 2025-12-04T09:17:10.7530058Z * [new branch] gh/mlazos/71/head -> origin/gh/mlazos/71/head 2025-12-04T09:17:10.7532715Z * [new branch] gh/mlazos/71/orig -> origin/gh/mlazos/71/orig 2025-12-04T09:17:10.7536278Z * [new branch] gh/mlazos/72/base -> origin/gh/mlazos/72/base 2025-12-04T09:17:10.7537572Z * [new branch] gh/mlazos/72/head -> origin/gh/mlazos/72/head 2025-12-04T09:17:10.7539776Z * [new branch] gh/mlazos/72/orig -> origin/gh/mlazos/72/orig 2025-12-04T09:17:10.7542764Z * [new branch] gh/mlazos/73/base -> origin/gh/mlazos/73/base 2025-12-04T09:17:10.7544991Z * [new branch] gh/mlazos/73/head -> origin/gh/mlazos/73/head 2025-12-04T09:17:10.7547138Z * [new branch] gh/mlazos/73/orig -> origin/gh/mlazos/73/orig 2025-12-04T09:17:10.7550697Z * [new branch] gh/mrmiywj/1/base -> origin/gh/mrmiywj/1/base 2025-12-04T09:17:10.7552908Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-12-04T09:17:10.7556315Z * [new branch] gh/muchulee8/73/base -> origin/gh/muchulee8/73/base 2025-12-04T09:17:10.7558658Z * [new branch] gh/muchulee8/73/head -> origin/gh/muchulee8/73/head 2025-12-04T09:17:10.7560864Z * [new branch] gh/muchulee8/73/orig -> origin/gh/muchulee8/73/orig 2025-12-04T09:17:10.7564231Z * [new branch] gh/naveenthangudu/1/base -> origin/gh/naveenthangudu/1/base 2025-12-04T09:17:10.7566414Z * [new branch] gh/naveenthangudu/1/head -> origin/gh/naveenthangudu/1/head 2025-12-04T09:17:10.7568667Z * [new branch] gh/naveenthangudu/1/orig -> origin/gh/naveenthangudu/1/orig 2025-12-04T09:17:10.7571503Z * [new branch] gh/naveenthangudu/2/base -> origin/gh/naveenthangudu/2/base 2025-12-04T09:17:10.7573594Z * [new branch] gh/naveenthangudu/2/head -> origin/gh/naveenthangudu/2/head 2025-12-04T09:17:10.7575830Z * [new branch] gh/naveenthangudu/2/orig -> origin/gh/naveenthangudu/2/orig 2025-12-04T09:17:10.7578545Z * [new branch] gh/naveenthangudu/3/base -> origin/gh/naveenthangudu/3/base 2025-12-04T09:17:10.7580843Z * [new branch] gh/naveenthangudu/3/head -> origin/gh/naveenthangudu/3/head 2025-12-04T09:17:10.7583122Z * [new branch] gh/naveenthangudu/3/orig -> origin/gh/naveenthangudu/3/orig 2025-12-04T09:17:10.7585968Z * [new branch] gh/naveenthangudu/4/base -> origin/gh/naveenthangudu/4/base 2025-12-04T09:17:10.7588035Z * [new branch] gh/naveenthangudu/4/head -> origin/gh/naveenthangudu/4/head 2025-12-04T09:17:10.7590348Z * [new branch] gh/naveenthangudu/4/orig -> origin/gh/naveenthangudu/4/orig 2025-12-04T09:17:10.7593130Z * [new branch] gh/naveenthangudu/5/base -> origin/gh/naveenthangudu/5/base 2025-12-04T09:17:10.7595235Z * [new branch] gh/naveenthangudu/5/head -> origin/gh/naveenthangudu/5/head 2025-12-04T09:17:10.7597563Z * [new branch] gh/naveenthangudu/5/orig -> origin/gh/naveenthangudu/5/orig 2025-12-04T09:17:10.7600414Z * [new branch] gh/naveenthangudu/6/base -> origin/gh/naveenthangudu/6/base 2025-12-04T09:17:10.7602501Z * [new branch] gh/naveenthangudu/6/head -> origin/gh/naveenthangudu/6/head 2025-12-04T09:17:10.7604649Z * [new branch] gh/naveenthangudu/6/orig -> origin/gh/naveenthangudu/6/orig 2025-12-04T09:17:10.7607434Z * [new branch] gh/naveenthangudu/7/base -> origin/gh/naveenthangudu/7/base 2025-12-04T09:17:10.7609649Z * [new branch] gh/naveenthangudu/7/head -> origin/gh/naveenthangudu/7/head 2025-12-04T09:17:10.7611764Z * [new branch] gh/naveenthangudu/7/orig -> origin/gh/naveenthangudu/7/orig 2025-12-04T09:17:10.7614934Z * [new branch] gh/naveenthangudu/8/base -> origin/gh/naveenthangudu/8/base 2025-12-04T09:17:10.7617142Z * [new branch] gh/naveenthangudu/8/head -> origin/gh/naveenthangudu/8/head 2025-12-04T09:17:10.7619253Z * [new branch] gh/naveenthangudu/8/orig -> origin/gh/naveenthangudu/8/orig 2025-12-04T09:17:10.7622268Z * [new branch] gh/naveenthangudu/9/base -> origin/gh/naveenthangudu/9/base 2025-12-04T09:17:10.7624452Z * [new branch] gh/naveenthangudu/9/head -> origin/gh/naveenthangudu/9/head 2025-12-04T09:17:10.7626698Z * [new branch] gh/naveenthangudu/9/orig -> origin/gh/naveenthangudu/9/orig 2025-12-04T09:17:10.7629970Z * [new branch] gh/nikitaved/1/base -> origin/gh/nikitaved/1/base 2025-12-04T09:17:10.7632209Z * [new branch] gh/nikitaved/1/head -> origin/gh/nikitaved/1/head 2025-12-04T09:17:10.7634328Z * [new branch] gh/nikitaved/1/orig -> origin/gh/nikitaved/1/orig 2025-12-04T09:17:10.7637197Z * [new branch] gh/nikitaved/10/base -> origin/gh/nikitaved/10/base 2025-12-04T09:17:10.7639349Z * [new branch] gh/nikitaved/10/head -> origin/gh/nikitaved/10/head 2025-12-04T09:17:10.7641450Z * [new branch] gh/nikitaved/10/orig -> origin/gh/nikitaved/10/orig 2025-12-04T09:17:10.7644198Z * [new branch] gh/nikitaved/11/base -> origin/gh/nikitaved/11/base 2025-12-04T09:17:10.7646324Z * [new branch] gh/nikitaved/11/head -> origin/gh/nikitaved/11/head 2025-12-04T09:17:10.7648448Z * [new branch] gh/nikitaved/11/orig -> origin/gh/nikitaved/11/orig 2025-12-04T09:17:10.7651228Z * [new branch] gh/nikitaved/12/base -> origin/gh/nikitaved/12/base 2025-12-04T09:17:10.7653303Z * [new branch] gh/nikitaved/12/head -> origin/gh/nikitaved/12/head 2025-12-04T09:17:10.7655404Z * [new branch] gh/nikitaved/12/orig -> origin/gh/nikitaved/12/orig 2025-12-04T09:17:10.7658174Z * [new branch] gh/nikitaved/13/base -> origin/gh/nikitaved/13/base 2025-12-04T09:17:10.7660381Z * [new branch] gh/nikitaved/13/head -> origin/gh/nikitaved/13/head 2025-12-04T09:17:10.7662499Z * [new branch] gh/nikitaved/13/orig -> origin/gh/nikitaved/13/orig 2025-12-04T09:17:10.7665666Z * [new branch] gh/nikitaved/14/base -> origin/gh/nikitaved/14/base 2025-12-04T09:17:10.7667829Z * [new branch] gh/nikitaved/14/head -> origin/gh/nikitaved/14/head 2025-12-04T09:17:10.7669980Z * [new branch] gh/nikitaved/14/orig -> origin/gh/nikitaved/14/orig 2025-12-04T09:17:10.7672672Z * [new branch] gh/nikitaved/15/base -> origin/gh/nikitaved/15/base 2025-12-04T09:17:10.7674760Z * [new branch] gh/nikitaved/15/head -> origin/gh/nikitaved/15/head 2025-12-04T09:17:10.7676873Z * [new branch] gh/nikitaved/15/orig -> origin/gh/nikitaved/15/orig 2025-12-04T09:17:10.7679724Z * [new branch] gh/nikitaved/16/base -> origin/gh/nikitaved/16/base 2025-12-04T09:17:10.7681809Z * [new branch] gh/nikitaved/16/head -> origin/gh/nikitaved/16/head 2025-12-04T09:17:10.7683966Z * [new branch] gh/nikitaved/16/orig -> origin/gh/nikitaved/16/orig 2025-12-04T09:17:10.7686800Z * [new branch] gh/nikitaved/2/base -> origin/gh/nikitaved/2/base 2025-12-04T09:17:10.7689005Z * [new branch] gh/nikitaved/2/head -> origin/gh/nikitaved/2/head 2025-12-04T09:17:10.7691595Z * [new branch] gh/nikitaved/2/orig -> origin/gh/nikitaved/2/orig 2025-12-04T09:17:10.7694394Z * [new branch] gh/nikitaved/4/base -> origin/gh/nikitaved/4/base 2025-12-04T09:17:10.7696443Z * [new branch] gh/nikitaved/4/head -> origin/gh/nikitaved/4/head 2025-12-04T09:17:10.7698523Z * [new branch] gh/nikitaved/4/orig -> origin/gh/nikitaved/4/orig 2025-12-04T09:17:10.7701368Z * [new branch] gh/nikitaved/5/base -> origin/gh/nikitaved/5/base 2025-12-04T09:17:10.7703610Z * [new branch] gh/nikitaved/5/head -> origin/gh/nikitaved/5/head 2025-12-04T09:17:10.7705949Z * [new branch] gh/nikitaved/5/orig -> origin/gh/nikitaved/5/orig 2025-12-04T09:17:10.7708555Z * [new branch] gh/nikitaved/6/base -> origin/gh/nikitaved/6/base 2025-12-04T09:17:10.7711059Z * [new branch] gh/nikitaved/6/head -> origin/gh/nikitaved/6/head 2025-12-04T09:17:10.7713212Z * [new branch] gh/nikitaved/6/orig -> origin/gh/nikitaved/6/orig 2025-12-04T09:17:10.7716483Z * [new branch] gh/nikitaved/8/base -> origin/gh/nikitaved/8/base 2025-12-04T09:17:10.7718689Z * [new branch] gh/nikitaved/8/head -> origin/gh/nikitaved/8/head 2025-12-04T09:17:10.7720799Z * [new branch] gh/nikitaved/8/orig -> origin/gh/nikitaved/8/orig 2025-12-04T09:17:10.7723691Z * [new branch] gh/nikitaved/9/base -> origin/gh/nikitaved/9/base 2025-12-04T09:17:10.7726089Z * [new branch] gh/nikitaved/9/head -> origin/gh/nikitaved/9/head 2025-12-04T09:17:10.7728162Z * [new branch] gh/nikitaved/9/orig -> origin/gh/nikitaved/9/orig 2025-12-04T09:17:10.7731441Z * [new branch] gh/oulgen/10/base -> origin/gh/oulgen/10/base 2025-12-04T09:17:10.7733606Z * [new branch] gh/oulgen/10/head -> origin/gh/oulgen/10/head 2025-12-04T09:17:10.7735694Z * [new branch] gh/oulgen/10/orig -> origin/gh/oulgen/10/orig 2025-12-04T09:17:10.7738452Z * [new branch] gh/oulgen/11/base -> origin/gh/oulgen/11/base 2025-12-04T09:17:10.7740538Z * [new branch] gh/oulgen/11/head -> origin/gh/oulgen/11/head 2025-12-04T09:17:10.7742704Z * [new branch] gh/oulgen/11/orig -> origin/gh/oulgen/11/orig 2025-12-04T09:17:10.7745658Z * [new branch] gh/oulgen/12/base -> origin/gh/oulgen/12/base 2025-12-04T09:17:10.7747726Z * [new branch] gh/oulgen/12/head -> origin/gh/oulgen/12/head 2025-12-04T09:17:10.7749847Z * [new branch] gh/oulgen/12/orig -> origin/gh/oulgen/12/orig 2025-12-04T09:17:10.7752697Z * [new branch] gh/oulgen/13/base -> origin/gh/oulgen/13/base 2025-12-04T09:17:10.7754824Z * [new branch] gh/oulgen/13/head -> origin/gh/oulgen/13/head 2025-12-04T09:17:10.7756959Z * [new branch] gh/oulgen/13/orig -> origin/gh/oulgen/13/orig 2025-12-04T09:17:10.7759790Z * [new branch] gh/oulgen/14/base -> origin/gh/oulgen/14/base 2025-12-04T09:17:10.7761850Z * [new branch] gh/oulgen/14/head -> origin/gh/oulgen/14/head 2025-12-04T09:17:10.7764334Z * [new branch] gh/oulgen/14/orig -> origin/gh/oulgen/14/orig 2025-12-04T09:17:10.7766864Z * [new branch] gh/oulgen/15/base -> origin/gh/oulgen/15/base 2025-12-04T09:17:10.7768992Z * [new branch] gh/oulgen/15/head -> origin/gh/oulgen/15/head 2025-12-04T09:17:10.7771129Z * [new branch] gh/oulgen/15/orig -> origin/gh/oulgen/15/orig 2025-12-04T09:17:10.7773939Z * [new branch] gh/oulgen/16/base -> origin/gh/oulgen/16/base 2025-12-04T09:17:10.7776005Z * [new branch] gh/oulgen/16/head -> origin/gh/oulgen/16/head 2025-12-04T09:17:10.7778046Z * [new branch] gh/oulgen/16/orig -> origin/gh/oulgen/16/orig 2025-12-04T09:17:10.7780841Z * [new branch] gh/oulgen/17/base -> origin/gh/oulgen/17/base 2025-12-04T09:17:10.7783060Z * [new branch] gh/oulgen/17/head -> origin/gh/oulgen/17/head 2025-12-04T09:17:10.7785234Z * [new branch] gh/oulgen/17/orig -> origin/gh/oulgen/17/orig 2025-12-04T09:17:10.7787950Z * [new branch] gh/oulgen/18/base -> origin/gh/oulgen/18/base 2025-12-04T09:17:10.7790067Z * [new branch] gh/oulgen/18/head -> origin/gh/oulgen/18/head 2025-12-04T09:17:10.7792384Z * [new branch] gh/oulgen/18/orig -> origin/gh/oulgen/18/orig 2025-12-04T09:17:10.7795045Z * [new branch] gh/oulgen/19/base -> origin/gh/oulgen/19/base 2025-12-04T09:17:10.7797208Z * [new branch] gh/oulgen/19/head -> origin/gh/oulgen/19/head 2025-12-04T09:17:10.7799404Z * [new branch] gh/oulgen/19/orig -> origin/gh/oulgen/19/orig 2025-12-04T09:17:10.7802266Z * [new branch] gh/oulgen/20/base -> origin/gh/oulgen/20/base 2025-12-04T09:17:10.7804422Z * [new branch] gh/oulgen/20/head -> origin/gh/oulgen/20/head 2025-12-04T09:17:10.7806490Z * [new branch] gh/oulgen/20/orig -> origin/gh/oulgen/20/orig 2025-12-04T09:17:10.7809219Z * [new branch] gh/oulgen/21/base -> origin/gh/oulgen/21/base 2025-12-04T09:17:10.7811281Z * [new branch] gh/oulgen/21/head -> origin/gh/oulgen/21/head 2025-12-04T09:17:10.7813442Z * [new branch] gh/oulgen/21/orig -> origin/gh/oulgen/21/orig 2025-12-04T09:17:10.7816217Z * [new branch] gh/oulgen/22/base -> origin/gh/oulgen/22/base 2025-12-04T09:17:10.7818282Z * [new branch] gh/oulgen/22/head -> origin/gh/oulgen/22/head 2025-12-04T09:17:10.7820481Z * [new branch] gh/oulgen/22/orig -> origin/gh/oulgen/22/orig 2025-12-04T09:17:10.7823352Z * [new branch] gh/oulgen/23/base -> origin/gh/oulgen/23/base 2025-12-04T09:17:10.7827084Z * [new branch] gh/oulgen/23/head -> origin/gh/oulgen/23/head 2025-12-04T09:17:10.7828994Z * [new branch] gh/oulgen/23/orig -> origin/gh/oulgen/23/orig 2025-12-04T09:17:10.7831816Z * [new branch] gh/oulgen/24/base -> origin/gh/oulgen/24/base 2025-12-04T09:17:10.7833927Z * [new branch] gh/oulgen/24/head -> origin/gh/oulgen/24/head 2025-12-04T09:17:10.7836002Z * [new branch] gh/oulgen/24/orig -> origin/gh/oulgen/24/orig 2025-12-04T09:17:10.7838757Z * [new branch] gh/oulgen/25/base -> origin/gh/oulgen/25/base 2025-12-04T09:17:10.7840871Z * [new branch] gh/oulgen/25/head -> origin/gh/oulgen/25/head 2025-12-04T09:17:10.7842995Z * [new branch] gh/oulgen/25/orig -> origin/gh/oulgen/25/orig 2025-12-04T09:17:10.7845755Z * [new branch] gh/oulgen/26/base -> origin/gh/oulgen/26/base 2025-12-04T09:17:10.7847906Z * [new branch] gh/oulgen/26/head -> origin/gh/oulgen/26/head 2025-12-04T09:17:10.7850060Z * [new branch] gh/oulgen/26/orig -> origin/gh/oulgen/26/orig 2025-12-04T09:17:10.7852856Z * [new branch] gh/oulgen/4/base -> origin/gh/oulgen/4/base 2025-12-04T09:17:10.7854923Z * [new branch] gh/oulgen/4/head -> origin/gh/oulgen/4/head 2025-12-04T09:17:10.7857020Z * [new branch] gh/oulgen/4/orig -> origin/gh/oulgen/4/orig 2025-12-04T09:17:10.7860445Z * [new branch] gh/oulgen/7/base -> origin/gh/oulgen/7/base 2025-12-04T09:17:10.7862933Z * [new branch] gh/oulgen/7/head -> origin/gh/oulgen/7/head 2025-12-04T09:17:10.7864952Z * [new branch] gh/oulgen/7/orig -> origin/gh/oulgen/7/orig 2025-12-04T09:17:10.7867936Z * [new branch] gh/oulgen/8/base -> origin/gh/oulgen/8/base 2025-12-04T09:17:10.7870004Z * [new branch] gh/oulgen/8/head -> origin/gh/oulgen/8/head 2025-12-04T09:17:10.7872136Z * [new branch] gh/oulgen/8/orig -> origin/gh/oulgen/8/orig 2025-12-04T09:17:10.7874912Z * [new branch] gh/oulgen/9/base -> origin/gh/oulgen/9/base 2025-12-04T09:17:10.7876994Z * [new branch] gh/oulgen/9/head -> origin/gh/oulgen/9/head 2025-12-04T09:17:10.7879252Z * [new branch] gh/oulgen/9/orig -> origin/gh/oulgen/9/orig 2025-12-04T09:17:10.7881979Z * [new branch] gh/patvig/mtia-serialization -> origin/gh/patvig/mtia-serialization 2025-12-04T09:17:10.7885509Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-12-04T09:17:10.7887780Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 2025-12-04T09:17:10.7890005Z * [new branch] gh/pearu/108/orig -> origin/gh/pearu/108/orig 2025-12-04T09:17:10.7892891Z * [new branch] gh/pearu/109/base -> origin/gh/pearu/109/base 2025-12-04T09:17:10.7894976Z * [new branch] gh/pearu/109/head -> origin/gh/pearu/109/head 2025-12-04T09:17:10.7897084Z * [new branch] gh/pearu/109/orig -> origin/gh/pearu/109/orig 2025-12-04T09:17:10.7899924Z * [new branch] gh/pearu/110/base -> origin/gh/pearu/110/base 2025-12-04T09:17:10.7902019Z * [new branch] gh/pearu/110/head -> origin/gh/pearu/110/head 2025-12-04T09:17:10.7904304Z * [new branch] gh/pearu/110/orig -> origin/gh/pearu/110/orig 2025-12-04T09:17:10.7907111Z * [new branch] gh/pearu/111/base -> origin/gh/pearu/111/base 2025-12-04T09:17:10.7909152Z * [new branch] gh/pearu/111/head -> origin/gh/pearu/111/head 2025-12-04T09:17:10.7911322Z * [new branch] gh/pearu/111/orig -> origin/gh/pearu/111/orig 2025-12-04T09:17:10.7914154Z * [new branch] gh/pearu/112/base -> origin/gh/pearu/112/base 2025-12-04T09:17:10.7916377Z * [new branch] gh/pearu/112/head -> origin/gh/pearu/112/head 2025-12-04T09:17:10.7918478Z * [new branch] gh/pearu/112/orig -> origin/gh/pearu/112/orig 2025-12-04T09:17:10.7921246Z * [new branch] gh/pearu/115/base -> origin/gh/pearu/115/base 2025-12-04T09:17:10.7923385Z * [new branch] gh/pearu/115/head -> origin/gh/pearu/115/head 2025-12-04T09:17:10.7925839Z * [new branch] gh/pearu/115/orig -> origin/gh/pearu/115/orig 2025-12-04T09:17:10.7929074Z * [new branch] gh/pearu/116/base -> origin/gh/pearu/116/base 2025-12-04T09:17:10.7931162Z * [new branch] gh/pearu/116/head -> origin/gh/pearu/116/head 2025-12-04T09:17:10.7933327Z * [new branch] gh/pearu/116/orig -> origin/gh/pearu/116/orig 2025-12-04T09:17:10.7936620Z * [new branch] gh/pearu/117/base -> origin/gh/pearu/117/base 2025-12-04T09:17:10.7938709Z * [new branch] gh/pearu/117/head -> origin/gh/pearu/117/head 2025-12-04T09:17:10.7940854Z * [new branch] gh/pearu/117/orig -> origin/gh/pearu/117/orig 2025-12-04T09:17:10.7943771Z * [new branch] gh/pearu/118/base -> origin/gh/pearu/118/base 2025-12-04T09:17:10.7945985Z * [new branch] gh/pearu/118/head -> origin/gh/pearu/118/head 2025-12-04T09:17:10.7948147Z * [new branch] gh/pearu/118/orig -> origin/gh/pearu/118/orig 2025-12-04T09:17:10.7950889Z * [new branch] gh/pearu/119/base -> origin/gh/pearu/119/base 2025-12-04T09:17:10.7952920Z * [new branch] gh/pearu/119/head -> origin/gh/pearu/119/head 2025-12-04T09:17:10.7955024Z * [new branch] gh/pearu/119/orig -> origin/gh/pearu/119/orig 2025-12-04T09:17:10.7957880Z * [new branch] gh/pearu/139/base -> origin/gh/pearu/139/base 2025-12-04T09:17:10.7960031Z * [new branch] gh/pearu/139/head -> origin/gh/pearu/139/head 2025-12-04T09:17:10.7962155Z * [new branch] gh/pearu/139/orig -> origin/gh/pearu/139/orig 2025-12-04T09:17:10.7964940Z * [new branch] gh/pearu/140/base -> origin/gh/pearu/140/base 2025-12-04T09:17:10.7967336Z * [new branch] gh/pearu/140/head -> origin/gh/pearu/140/head 2025-12-04T09:17:10.7969259Z * [new branch] gh/pearu/140/orig -> origin/gh/pearu/140/orig 2025-12-04T09:17:10.7972089Z * [new branch] gh/pearu/142/base -> origin/gh/pearu/142/base 2025-12-04T09:17:10.7974318Z * [new branch] gh/pearu/142/head -> origin/gh/pearu/142/head 2025-12-04T09:17:10.7976435Z * [new branch] gh/pearu/142/orig -> origin/gh/pearu/142/orig 2025-12-04T09:17:10.7979217Z * [new branch] gh/pearu/143/base -> origin/gh/pearu/143/base 2025-12-04T09:17:10.7981383Z * [new branch] gh/pearu/143/head -> origin/gh/pearu/143/head 2025-12-04T09:17:10.7983686Z * [new branch] gh/pearu/143/orig -> origin/gh/pearu/143/orig 2025-12-04T09:17:10.7986567Z * [new branch] gh/pearu/147/base -> origin/gh/pearu/147/base 2025-12-04T09:17:10.7988656Z * [new branch] gh/pearu/147/head -> origin/gh/pearu/147/head 2025-12-04T09:17:10.7990698Z * [new branch] gh/pearu/147/orig -> origin/gh/pearu/147/orig 2025-12-04T09:17:10.7994055Z * [new branch] gh/pearu/149/base -> origin/gh/pearu/149/base 2025-12-04T09:17:10.7996647Z * [new branch] gh/pearu/149/head -> origin/gh/pearu/149/head 2025-12-04T09:17:10.7998726Z * [new branch] gh/pearu/149/orig -> origin/gh/pearu/149/orig 2025-12-04T09:17:10.8002003Z * [new branch] gh/pearu/150/base -> origin/gh/pearu/150/base 2025-12-04T09:17:10.8004202Z * [new branch] gh/pearu/150/head -> origin/gh/pearu/150/head 2025-12-04T09:17:10.8006327Z * [new branch] gh/pearu/150/orig -> origin/gh/pearu/150/orig 2025-12-04T09:17:10.8009267Z * [new branch] gh/pearu/151/base -> origin/gh/pearu/151/base 2025-12-04T09:17:10.8011453Z * [new branch] gh/pearu/151/head -> origin/gh/pearu/151/head 2025-12-04T09:17:10.8013482Z * [new branch] gh/pearu/151/orig -> origin/gh/pearu/151/orig 2025-12-04T09:17:10.8016406Z * [new branch] gh/pearu/152/base -> origin/gh/pearu/152/base 2025-12-04T09:17:10.8018425Z * [new branch] gh/pearu/152/head -> origin/gh/pearu/152/head 2025-12-04T09:17:10.8021319Z * [new branch] gh/pearu/152/orig -> origin/gh/pearu/152/orig 2025-12-04T09:17:10.8024571Z * [new branch] gh/pearu/153/base -> origin/gh/pearu/153/base 2025-12-04T09:17:10.8028836Z * [new branch] gh/pearu/153/head -> origin/gh/pearu/153/head 2025-12-04T09:17:10.8030979Z * [new branch] gh/pearu/153/orig -> origin/gh/pearu/153/orig 2025-12-04T09:17:10.8034024Z * [new branch] gh/pearu/154/base -> origin/gh/pearu/154/base 2025-12-04T09:17:10.8036329Z * [new branch] gh/pearu/154/head -> origin/gh/pearu/154/head 2025-12-04T09:17:10.8038518Z * [new branch] gh/pearu/154/orig -> origin/gh/pearu/154/orig 2025-12-04T09:17:10.8041507Z * [new branch] gh/pearu/155/base -> origin/gh/pearu/155/base 2025-12-04T09:17:10.8043681Z * [new branch] gh/pearu/155/head -> origin/gh/pearu/155/head 2025-12-04T09:17:10.8046087Z * [new branch] gh/pearu/155/orig -> origin/gh/pearu/155/orig 2025-12-04T09:17:10.8048798Z * [new branch] gh/pearu/156/base -> origin/gh/pearu/156/base 2025-12-04T09:17:10.8051019Z * [new branch] gh/pearu/156/head -> origin/gh/pearu/156/head 2025-12-04T09:17:10.8053223Z * [new branch] gh/pearu/156/orig -> origin/gh/pearu/156/orig 2025-12-04T09:17:10.8056509Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-12-04T09:17:10.8059050Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-12-04T09:17:10.8061344Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-12-04T09:17:10.8064579Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-12-04T09:17:10.8067064Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-12-04T09:17:10.8069266Z * [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-12-04T09:17:10.8072613Z * [new branch] gh/pianpwk/21/base -> origin/gh/pianpwk/21/base 2025-12-04T09:17:10.8074819Z * [new branch] gh/pianpwk/21/head -> origin/gh/pianpwk/21/head 2025-12-04T09:17:10.8077734Z * [new branch] gh/pianpwk/28/base -> origin/gh/pianpwk/28/base 2025-12-04T09:17:10.8079913Z * [new branch] gh/pianpwk/28/head -> origin/gh/pianpwk/28/head 2025-12-04T09:17:10.8082147Z * [new branch] gh/pianpwk/28/orig -> origin/gh/pianpwk/28/orig 2025-12-04T09:17:10.8085441Z * [new branch] gh/pianpwk/29/base -> origin/gh/pianpwk/29/base 2025-12-04T09:17:10.8088180Z * [new branch] gh/pianpwk/29/head -> origin/gh/pianpwk/29/head 2025-12-04T09:17:10.8090343Z * [new branch] gh/pianpwk/29/orig -> origin/gh/pianpwk/29/orig 2025-12-04T09:17:10.8093359Z * [new branch] gh/pianpwk/30/base -> origin/gh/pianpwk/30/base 2025-12-04T09:17:10.8095894Z * [new branch] gh/pianpwk/30/head -> origin/gh/pianpwk/30/head 2025-12-04T09:17:10.8098162Z * [new branch] gh/pianpwk/30/orig -> origin/gh/pianpwk/30/orig 2025-12-04T09:17:10.8100887Z * [new branch] gh/pianpwk/31/base -> origin/gh/pianpwk/31/base 2025-12-04T09:17:10.8103215Z * [new branch] gh/pianpwk/31/head -> origin/gh/pianpwk/31/head 2025-12-04T09:17:10.8105480Z * [new branch] gh/pianpwk/31/orig -> origin/gh/pianpwk/31/orig 2025-12-04T09:17:10.8108294Z * [new branch] gh/pianpwk/32/base -> origin/gh/pianpwk/32/base 2025-12-04T09:17:10.8110548Z * [new branch] gh/pianpwk/32/head -> origin/gh/pianpwk/32/head 2025-12-04T09:17:10.8112859Z * [new branch] gh/pianpwk/32/orig -> origin/gh/pianpwk/32/orig 2025-12-04T09:17:10.8115497Z * [new branch] gh/pianpwk/33/base -> origin/gh/pianpwk/33/base 2025-12-04T09:17:10.8117697Z * [new branch] gh/pianpwk/33/head -> origin/gh/pianpwk/33/head 2025-12-04T09:17:10.8119799Z * [new branch] gh/pianpwk/33/orig -> origin/gh/pianpwk/33/orig 2025-12-04T09:17:10.8122911Z * [new branch] gh/pianpwk/34/base -> origin/gh/pianpwk/34/base 2025-12-04T09:17:10.8125525Z * [new branch] gh/pianpwk/34/head -> origin/gh/pianpwk/34/head 2025-12-04T09:17:10.8128114Z * [new branch] gh/pianpwk/34/orig -> origin/gh/pianpwk/34/orig 2025-12-04T09:17:10.8131240Z * [new branch] gh/pianpwk/35/base -> origin/gh/pianpwk/35/base 2025-12-04T09:17:10.8133287Z * [new branch] gh/pianpwk/35/head -> origin/gh/pianpwk/35/head 2025-12-04T09:17:10.8135567Z * [new branch] gh/pianpwk/35/orig -> origin/gh/pianpwk/35/orig 2025-12-04T09:17:10.8138870Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-12-04T09:17:10.8141045Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-12-04T09:17:10.8144024Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-12-04T09:17:10.8146211Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-12-04T09:17:10.8148339Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-12-04T09:17:10.8151379Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-12-04T09:17:10.8153504Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 2025-12-04T09:17:10.8155752Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-12-04T09:17:10.8158573Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-12-04T09:17:10.8160776Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-12-04T09:17:10.8163712Z * [new branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-12-04T09:17:10.8166401Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-12-04T09:17:10.8168676Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-12-04T09:17:10.8171105Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-12-04T09:17:10.8173931Z * [new branch] gh/rec/167/base -> origin/gh/rec/167/base 2025-12-04T09:17:10.8176304Z * [new branch] gh/rec/167/head -> origin/gh/rec/167/head 2025-12-04T09:17:10.8178285Z * [new branch] gh/rec/167/orig -> origin/gh/rec/167/orig 2025-12-04T09:17:10.8181134Z * [new branch] gh/rec/168/base -> origin/gh/rec/168/base 2025-12-04T09:17:10.8183407Z * [new branch] gh/rec/168/head -> origin/gh/rec/168/head 2025-12-04T09:17:10.8185594Z * [new branch] gh/rec/168/orig -> origin/gh/rec/168/orig 2025-12-04T09:17:10.8188577Z * [new branch] gh/rec/169/base -> origin/gh/rec/169/base 2025-12-04T09:17:10.8190942Z * [new branch] gh/rec/169/head -> origin/gh/rec/169/head 2025-12-04T09:17:10.8193281Z * [new branch] gh/rec/169/orig -> origin/gh/rec/169/orig 2025-12-04T09:17:10.8196009Z * [new branch] gh/rec/170/base -> origin/gh/rec/170/base 2025-12-04T09:17:10.8198145Z * [new branch] gh/rec/170/head -> origin/gh/rec/170/head 2025-12-04T09:17:10.8200556Z * [new branch] gh/rec/170/orig -> origin/gh/rec/170/orig 2025-12-04T09:17:10.8203575Z * [new branch] gh/rec/171/base -> origin/gh/rec/171/base 2025-12-04T09:17:10.8205524Z * [new branch] gh/rec/171/head -> origin/gh/rec/171/head 2025-12-04T09:17:10.8207667Z * [new branch] gh/rec/171/orig -> origin/gh/rec/171/orig 2025-12-04T09:17:10.8210712Z * [new branch] gh/rec/172/base -> origin/gh/rec/172/base 2025-12-04T09:17:10.8212857Z * [new branch] gh/rec/172/head -> origin/gh/rec/172/head 2025-12-04T09:17:10.8215252Z * [new branch] gh/rec/172/orig -> origin/gh/rec/172/orig 2025-12-04T09:17:10.8218262Z * [new branch] gh/rec/173/base -> origin/gh/rec/173/base 2025-12-04T09:17:10.8220292Z * [new branch] gh/rec/173/head -> origin/gh/rec/173/head 2025-12-04T09:17:10.8222423Z * [new branch] gh/rec/173/orig -> origin/gh/rec/173/orig 2025-12-04T09:17:10.8225479Z * [new branch] gh/rec/174/base -> origin/gh/rec/174/base 2025-12-04T09:17:10.8227978Z * [new branch] gh/rec/174/head -> origin/gh/rec/174/head 2025-12-04T09:17:10.8230204Z * [new branch] gh/rec/174/orig -> origin/gh/rec/174/orig 2025-12-04T09:17:10.8233215Z * [new branch] gh/rec/175/base -> origin/gh/rec/175/base 2025-12-04T09:17:10.8235178Z * [new branch] gh/rec/175/head -> origin/gh/rec/175/head 2025-12-04T09:17:10.8237426Z * [new branch] gh/rec/175/orig -> origin/gh/rec/175/orig 2025-12-04T09:17:10.8240340Z * [new branch] gh/rec/176/base -> origin/gh/rec/176/base 2025-12-04T09:17:10.8242324Z * [new branch] gh/rec/176/head -> origin/gh/rec/176/head 2025-12-04T09:17:10.8244662Z * [new branch] gh/rec/176/orig -> origin/gh/rec/176/orig 2025-12-04T09:17:10.8247353Z * [new branch] gh/rec/177/base -> origin/gh/rec/177/base 2025-12-04T09:17:10.8249466Z * [new branch] gh/rec/177/head -> origin/gh/rec/177/head 2025-12-04T09:17:10.8252043Z * [new branch] gh/rec/177/orig -> origin/gh/rec/177/orig 2025-12-04T09:17:10.8263614Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-12-04T09:17:10.8264304Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-12-04T09:17:10.8264908Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-12-04T09:17:10.8265502Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-12-04T09:17:10.8266091Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-12-04T09:17:10.8266682Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-12-04T09:17:10.8269909Z * [new branch] gh/robert-hardwick/5/base -> origin/gh/robert-hardwick/5/base 2025-12-04T09:17:10.8271811Z * [new branch] gh/robert-hardwick/5/head -> origin/gh/robert-hardwick/5/head 2025-12-04T09:17:10.8274556Z * [new branch] gh/robert-hardwick/5/orig -> origin/gh/robert-hardwick/5/orig 2025-12-04T09:17:10.8277325Z * [new branch] gh/robert-hardwick/6/base -> origin/gh/robert-hardwick/6/base 2025-12-04T09:17:10.8279407Z * [new branch] gh/robert-hardwick/6/head -> origin/gh/robert-hardwick/6/head 2025-12-04T09:17:10.8281504Z * [new branch] gh/robert-hardwick/6/orig -> origin/gh/robert-hardwick/6/orig 2025-12-04T09:17:10.8284293Z * [new branch] gh/robert-hardwick/7/base -> origin/gh/robert-hardwick/7/base 2025-12-04T09:17:10.8286392Z * [new branch] gh/robert-hardwick/7/head -> origin/gh/robert-hardwick/7/head 2025-12-04T09:17:10.8288557Z * [new branch] gh/robert-hardwick/7/orig -> origin/gh/robert-hardwick/7/orig 2025-12-04T09:17:10.8291324Z * [new branch] gh/robert-hardwick/8/base -> origin/gh/robert-hardwick/8/base 2025-12-04T09:17:10.8293430Z * [new branch] gh/robert-hardwick/8/head -> origin/gh/robert-hardwick/8/head 2025-12-04T09:17:10.8295577Z * [new branch] gh/robert-hardwick/8/orig -> origin/gh/robert-hardwick/8/orig 2025-12-04T09:17:10.8298451Z * [new branch] gh/robert-hardwick/9/base -> origin/gh/robert-hardwick/9/base 2025-12-04T09:17:10.8300518Z * [new branch] gh/robert-hardwick/9/head -> origin/gh/robert-hardwick/9/head 2025-12-04T09:17:10.8302762Z * [new branch] gh/robert-hardwick/9/orig -> origin/gh/robert-hardwick/9/orig 2025-12-04T09:17:10.8306614Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-12-04T09:17:10.8308714Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-12-04T09:17:10.8312008Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-12-04T09:17:10.8314079Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-12-04T09:17:10.8317017Z * [new branch] gh/rtimpe/22/base -> origin/gh/rtimpe/22/base 2025-12-04T09:17:10.8319109Z * [new branch] gh/rtimpe/22/head -> origin/gh/rtimpe/22/head 2025-12-04T09:17:10.8321306Z * [new branch] gh/rtimpe/22/orig -> origin/gh/rtimpe/22/orig 2025-12-04T09:17:10.8324013Z * [new branch] gh/rtimpe/23/base -> origin/gh/rtimpe/23/base 2025-12-04T09:17:10.8326641Z * [new branch] gh/rtimpe/23/head -> origin/gh/rtimpe/23/head 2025-12-04T09:17:10.8328543Z * [new branch] gh/rtimpe/23/orig -> origin/gh/rtimpe/23/orig 2025-12-04T09:17:10.8331287Z * [new branch] gh/rtimpe/24/base -> origin/gh/rtimpe/24/base 2025-12-04T09:17:10.8333383Z * [new branch] gh/rtimpe/24/head -> origin/gh/rtimpe/24/head 2025-12-04T09:17:10.8335473Z * [new branch] gh/rtimpe/24/orig -> origin/gh/rtimpe/24/orig 2025-12-04T09:17:10.8338251Z * [new branch] gh/rtimpe/25/base -> origin/gh/rtimpe/25/base 2025-12-04T09:17:10.8340454Z * [new branch] gh/rtimpe/25/head -> origin/gh/rtimpe/25/head 2025-12-04T09:17:10.8342586Z * [new branch] gh/rtimpe/25/orig -> origin/gh/rtimpe/25/orig 2025-12-04T09:17:10.8345484Z * [new branch] gh/rtimpe/26/base -> origin/gh/rtimpe/26/base 2025-12-04T09:17:10.8347606Z * [new branch] gh/rtimpe/26/head -> origin/gh/rtimpe/26/head 2025-12-04T09:17:10.8349772Z * [new branch] gh/rtimpe/26/orig -> origin/gh/rtimpe/26/orig 2025-12-04T09:17:10.8352452Z * [new branch] gh/rtimpe/27/base -> origin/gh/rtimpe/27/base 2025-12-04T09:17:10.8354642Z * [new branch] gh/rtimpe/27/head -> origin/gh/rtimpe/27/head 2025-12-04T09:17:10.8357331Z * [new branch] gh/rtimpe/27/orig -> origin/gh/rtimpe/27/orig 2025-12-04T09:17:10.8360175Z * [new branch] gh/rtimpe/28/base -> origin/gh/rtimpe/28/base 2025-12-04T09:17:10.8362210Z * [new branch] gh/rtimpe/28/head -> origin/gh/rtimpe/28/head 2025-12-04T09:17:10.8364513Z * [new branch] gh/rtimpe/28/orig -> origin/gh/rtimpe/28/orig 2025-12-04T09:17:10.8367272Z * [new branch] gh/rtimpe/29/base -> origin/gh/rtimpe/29/base 2025-12-04T09:17:10.8369416Z * [new branch] gh/rtimpe/29/head -> origin/gh/rtimpe/29/head 2025-12-04T09:17:10.8371505Z * [new branch] gh/rtimpe/29/orig -> origin/gh/rtimpe/29/orig 2025-12-04T09:17:10.8374235Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-12-04T09:17:10.8376308Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-12-04T09:17:10.8379066Z * [new branch] gh/rtimpe/30/base -> origin/gh/rtimpe/30/base 2025-12-04T09:17:10.8381119Z * [new branch] gh/rtimpe/30/head -> origin/gh/rtimpe/30/head 2025-12-04T09:17:10.8383379Z * [new branch] gh/rtimpe/30/orig -> origin/gh/rtimpe/30/orig 2025-12-04T09:17:10.8386190Z * [new branch] gh/rtimpe/31/base -> origin/gh/rtimpe/31/base 2025-12-04T09:17:10.8388347Z * [new branch] gh/rtimpe/31/head -> origin/gh/rtimpe/31/head 2025-12-04T09:17:10.8390533Z * [new branch] gh/rtimpe/31/orig -> origin/gh/rtimpe/31/orig 2025-12-04T09:17:10.8393264Z * [new branch] gh/rtimpe/32/base -> origin/gh/rtimpe/32/base 2025-12-04T09:17:10.8395309Z * [new branch] gh/rtimpe/32/head -> origin/gh/rtimpe/32/head 2025-12-04T09:17:10.8397447Z * [new branch] gh/rtimpe/32/orig -> origin/gh/rtimpe/32/orig 2025-12-04T09:17:10.8400168Z * [new branch] gh/rtimpe/33/base -> origin/gh/rtimpe/33/base 2025-12-04T09:17:10.8402214Z * [new branch] gh/rtimpe/33/head -> origin/gh/rtimpe/33/head 2025-12-04T09:17:10.8404330Z * [new branch] gh/rtimpe/33/orig -> origin/gh/rtimpe/33/orig 2025-12-04T09:17:10.8407111Z * [new branch] gh/rtimpe/34/base -> origin/gh/rtimpe/34/base 2025-12-04T09:17:10.8409174Z * [new branch] gh/rtimpe/34/head -> origin/gh/rtimpe/34/head 2025-12-04T09:17:10.8411487Z * [new branch] gh/rtimpe/34/orig -> origin/gh/rtimpe/34/orig 2025-12-04T09:17:10.8414147Z * [new branch] gh/rtimpe/35/base -> origin/gh/rtimpe/35/base 2025-12-04T09:17:10.8416271Z * [new branch] gh/rtimpe/35/head -> origin/gh/rtimpe/35/head 2025-12-04T09:17:10.8418420Z * [new branch] gh/rtimpe/35/orig -> origin/gh/rtimpe/35/orig 2025-12-04T09:17:10.8421166Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-12-04T09:17:10.8423298Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-12-04T09:17:10.8429300Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-12-04T09:17:10.8431228Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-12-04T09:17:10.8433351Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-12-04T09:17:10.8436186Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-12-04T09:17:10.8438326Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-12-04T09:17:10.8440616Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-12-04T09:17:10.8443429Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 2025-12-04T09:17:10.8445634Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-12-04T09:17:10.8447774Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-12-04T09:17:10.8450541Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-12-04T09:17:10.8452619Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 2025-12-04T09:17:10.8454732Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-12-04T09:17:10.8457690Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-12-04T09:17:10.8459770Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-12-04T09:17:10.8461864Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-12-04T09:17:10.8465155Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-12-04T09:17:10.8467230Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-12-04T09:17:10.8469429Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-12-04T09:17:10.8472222Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-12-04T09:17:10.8474354Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-12-04T09:17:10.8476419Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-12-04T09:17:10.8479832Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-12-04T09:17:10.8481917Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-12-04T09:17:10.8484105Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-12-04T09:17:10.8486794Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-12-04T09:17:10.8488865Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-12-04T09:17:10.8490988Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-12-04T09:17:10.8493783Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-12-04T09:17:10.8495954Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-12-04T09:17:10.8498187Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-12-04T09:17:10.8500770Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-12-04T09:17:10.8502830Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-12-04T09:17:10.8505027Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-12-04T09:17:10.8507741Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-12-04T09:17:10.8509841Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-12-04T09:17:10.8511923Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-12-04T09:17:10.8514725Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-12-04T09:17:10.8516816Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-12-04T09:17:10.8519456Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-12-04T09:17:10.8522190Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-12-04T09:17:10.8524595Z * [new branch] gh/seemethere/63/head -> origin/gh/seemethere/63/head 2025-12-04T09:17:10.8526886Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-12-04T09:17:10.8530224Z * [new branch] gh/seemethere/71/base -> origin/gh/seemethere/71/base 2025-12-04T09:17:10.8532332Z * [new branch] gh/seemethere/71/head -> origin/gh/seemethere/71/head 2025-12-04T09:17:10.8534457Z * [new branch] gh/seemethere/71/orig -> origin/gh/seemethere/71/orig 2025-12-04T09:17:10.8537303Z * [new branch] gh/seemethere/72/base -> origin/gh/seemethere/72/base 2025-12-04T09:17:10.8539401Z * [new branch] gh/seemethere/72/head -> origin/gh/seemethere/72/head 2025-12-04T09:17:10.8541574Z * [new branch] gh/seemethere/72/orig -> origin/gh/seemethere/72/orig 2025-12-04T09:17:10.8544853Z * [new branch] gh/seemethere/73/base -> origin/gh/seemethere/73/base 2025-12-04T09:17:10.8547146Z * [new branch] gh/seemethere/73/head -> origin/gh/seemethere/73/head 2025-12-04T09:17:10.8549022Z * [new branch] gh/seemethere/73/orig -> origin/gh/seemethere/73/orig 2025-12-04T09:17:10.8552336Z * [new branch] gh/seemethere/74/base -> origin/gh/seemethere/74/base 2025-12-04T09:17:10.8554526Z * [new branch] gh/seemethere/74/head -> origin/gh/seemethere/74/head 2025-12-04T09:17:10.8556621Z * [new branch] gh/seemethere/74/orig -> origin/gh/seemethere/74/orig 2025-12-04T09:17:10.8559457Z * [new branch] gh/seemethere/75/base -> origin/gh/seemethere/75/base 2025-12-04T09:17:10.8561566Z * [new branch] gh/seemethere/75/head -> origin/gh/seemethere/75/head 2025-12-04T09:17:10.8563641Z * [new branch] gh/seemethere/75/orig -> origin/gh/seemethere/75/orig 2025-12-04T09:17:10.8566533Z * [new branch] gh/seemethere/76/base -> origin/gh/seemethere/76/base 2025-12-04T09:17:10.8568654Z * [new branch] gh/seemethere/76/head -> origin/gh/seemethere/76/head 2025-12-04T09:17:10.8570804Z * [new branch] gh/seemethere/76/orig -> origin/gh/seemethere/76/orig 2025-12-04T09:17:10.8574384Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-12-04T09:17:10.8576679Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-12-04T09:17:10.8578927Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-12-04T09:17:10.8582162Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-12-04T09:17:10.8584567Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-12-04T09:17:10.8586714Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-12-04T09:17:10.8589622Z * [new branch] gh/shunting314/249/base -> origin/gh/shunting314/249/base 2025-12-04T09:17:10.8591738Z * [new branch] gh/shunting314/249/head -> origin/gh/shunting314/249/head 2025-12-04T09:17:10.8593926Z * [new branch] gh/shunting314/249/orig -> origin/gh/shunting314/249/orig 2025-12-04T09:17:10.8596867Z * [new branch] gh/shunting314/253/base -> origin/gh/shunting314/253/base 2025-12-04T09:17:10.8598899Z * [new branch] gh/shunting314/253/head -> origin/gh/shunting314/253/head 2025-12-04T09:17:10.8600929Z * [new branch] gh/shunting314/253/orig -> origin/gh/shunting314/253/orig 2025-12-04T09:17:10.8603742Z * [new branch] gh/shunting314/256/base -> origin/gh/shunting314/256/base 2025-12-04T09:17:10.8605832Z * [new branch] gh/shunting314/256/head -> origin/gh/shunting314/256/head 2025-12-04T09:17:10.8607987Z * [new branch] gh/shunting314/256/orig -> origin/gh/shunting314/256/orig 2025-12-04T09:17:10.8611234Z * [new branch] gh/shunting314/257/base -> origin/gh/shunting314/257/base 2025-12-04T09:17:10.8613295Z * [new branch] gh/shunting314/257/head -> origin/gh/shunting314/257/head 2025-12-04T09:17:10.8615759Z * [new branch] gh/shunting314/257/orig -> origin/gh/shunting314/257/orig 2025-12-04T09:17:10.8618536Z * [new branch] gh/shunting314/258/base -> origin/gh/shunting314/258/base 2025-12-04T09:17:10.8620561Z * [new branch] gh/shunting314/258/head -> origin/gh/shunting314/258/head 2025-12-04T09:17:10.8622782Z * [new branch] gh/shunting314/258/orig -> origin/gh/shunting314/258/orig 2025-12-04T09:17:10.8625680Z * [new branch] gh/shunting314/259/base -> origin/gh/shunting314/259/base 2025-12-04T09:17:10.8627841Z * [new branch] gh/shunting314/259/head -> origin/gh/shunting314/259/head 2025-12-04T09:17:10.8629927Z * [new branch] gh/shunting314/259/orig -> origin/gh/shunting314/259/orig 2025-12-04T09:17:10.8632789Z * [new branch] gh/shunting314/260/base -> origin/gh/shunting314/260/base 2025-12-04T09:17:10.8635054Z * [new branch] gh/shunting314/260/head -> origin/gh/shunting314/260/head 2025-12-04T09:17:10.8637188Z * [new branch] gh/shunting314/260/orig -> origin/gh/shunting314/260/orig 2025-12-04T09:17:10.8640169Z * [new branch] gh/shunting314/261/base -> origin/gh/shunting314/261/base 2025-12-04T09:17:10.8642351Z * [new branch] gh/shunting314/261/head -> origin/gh/shunting314/261/head 2025-12-04T09:17:10.8644518Z * [new branch] gh/shunting314/261/orig -> origin/gh/shunting314/261/orig 2025-12-04T09:17:10.8647396Z * [new branch] gh/shunting314/262/base -> origin/gh/shunting314/262/base 2025-12-04T09:17:10.8649595Z * [new branch] gh/shunting314/262/head -> origin/gh/shunting314/262/head 2025-12-04T09:17:10.8651757Z * [new branch] gh/shunting314/262/orig -> origin/gh/shunting314/262/orig 2025-12-04T09:17:10.8654636Z * [new branch] gh/shunting314/263/base -> origin/gh/shunting314/263/base 2025-12-04T09:17:10.8656918Z * [new branch] gh/shunting314/263/head -> origin/gh/shunting314/263/head 2025-12-04T09:17:10.8659076Z * [new branch] gh/shunting314/263/orig -> origin/gh/shunting314/263/orig 2025-12-04T09:17:10.8661937Z * [new branch] gh/shunting314/264/base -> origin/gh/shunting314/264/base 2025-12-04T09:17:10.8664405Z * [new branch] gh/shunting314/264/head -> origin/gh/shunting314/264/head 2025-12-04T09:17:10.8666331Z * [new branch] gh/shunting314/264/orig -> origin/gh/shunting314/264/orig 2025-12-04T09:17:10.8669344Z * [new branch] gh/shunting314/265/base -> origin/gh/shunting314/265/base 2025-12-04T09:17:10.8671388Z * [new branch] gh/shunting314/265/head -> origin/gh/shunting314/265/head 2025-12-04T09:17:10.8673488Z * [new branch] gh/shunting314/265/orig -> origin/gh/shunting314/265/orig 2025-12-04T09:17:10.8676301Z * [new branch] gh/shunting314/266/base -> origin/gh/shunting314/266/base 2025-12-04T09:17:10.8678605Z * [new branch] gh/shunting314/266/head -> origin/gh/shunting314/266/head 2025-12-04T09:17:10.8680741Z * [new branch] gh/shunting314/266/orig -> origin/gh/shunting314/266/orig 2025-12-04T09:17:10.8683809Z * [new branch] gh/shunting314/267/base -> origin/gh/shunting314/267/base 2025-12-04T09:17:10.8686029Z * [new branch] gh/shunting314/267/head -> origin/gh/shunting314/267/head 2025-12-04T09:17:10.8688676Z * [new branch] gh/shunting314/267/orig -> origin/gh/shunting314/267/orig 2025-12-04T09:17:10.8692013Z * [new branch] gh/shunting314/268/base -> origin/gh/shunting314/268/base 2025-12-04T09:17:10.8694225Z * [new branch] gh/shunting314/268/head -> origin/gh/shunting314/268/head 2025-12-04T09:17:10.8696352Z * [new branch] gh/shunting314/268/orig -> origin/gh/shunting314/268/orig 2025-12-04T09:17:10.8699440Z * [new branch] gh/shunting314/269/base -> origin/gh/shunting314/269/base 2025-12-04T09:17:10.8702292Z * [new branch] gh/shunting314/269/head -> origin/gh/shunting314/269/head 2025-12-04T09:17:10.8704612Z * [new branch] gh/shunting314/269/orig -> origin/gh/shunting314/269/orig 2025-12-04T09:17:10.8707955Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-12-04T09:17:10.8710041Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-12-04T09:17:10.8712696Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 2025-12-04T09:17:10.8714737Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-12-04T09:17:10.8717384Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-12-04T09:17:10.8719507Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-12-04T09:17:10.8722185Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-12-04T09:17:10.8724452Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-12-04T09:17:10.8727989Z * [new branch] gh/slayton58/39/base -> origin/gh/slayton58/39/base 2025-12-04T09:17:10.8730104Z * [new branch] gh/slayton58/39/head -> origin/gh/slayton58/39/head 2025-12-04T09:17:10.8732303Z * [new branch] gh/slayton58/39/orig -> origin/gh/slayton58/39/orig 2025-12-04T09:17:10.8735142Z * [new branch] gh/slayton58/42/base -> origin/gh/slayton58/42/base 2025-12-04T09:17:10.8737244Z * [new branch] gh/slayton58/42/head -> origin/gh/slayton58/42/head 2025-12-04T09:17:10.8739409Z * [new branch] gh/slayton58/42/orig -> origin/gh/slayton58/42/orig 2025-12-04T09:17:10.8742180Z * [new branch] gh/slayton58/43/base -> origin/gh/slayton58/43/base 2025-12-04T09:17:10.8744505Z * [new branch] gh/slayton58/43/head -> origin/gh/slayton58/43/head 2025-12-04T09:17:10.8746635Z * [new branch] gh/slayton58/43/orig -> origin/gh/slayton58/43/orig 2025-12-04T09:17:10.8749578Z * [new branch] gh/slayton58/44/base -> origin/gh/slayton58/44/base 2025-12-04T09:17:10.8751832Z * [new branch] gh/slayton58/44/head -> origin/gh/slayton58/44/head 2025-12-04T09:17:10.8753957Z * [new branch] gh/slayton58/44/orig -> origin/gh/slayton58/44/orig 2025-12-04T09:17:10.8756672Z * [new branch] gh/slayton58/45/base -> origin/gh/slayton58/45/base 2025-12-04T09:17:10.8758787Z * [new branch] gh/slayton58/45/head -> origin/gh/slayton58/45/head 2025-12-04T09:17:10.8760967Z * [new branch] gh/slayton58/45/orig -> origin/gh/slayton58/45/orig 2025-12-04T09:17:10.8763777Z * [new branch] gh/slayton58/46/base -> origin/gh/slayton58/46/base 2025-12-04T09:17:10.8766044Z * [new branch] gh/slayton58/46/head -> origin/gh/slayton58/46/head 2025-12-04T09:17:10.8768151Z * [new branch] gh/slayton58/46/orig -> origin/gh/slayton58/46/orig 2025-12-04T09:17:10.8771006Z * [new branch] gh/slayton58/6/base -> origin/gh/slayton58/6/base 2025-12-04T09:17:10.8773151Z * [new branch] gh/slayton58/6/head -> origin/gh/slayton58/6/head 2025-12-04T09:17:10.8775799Z * [new branch] gh/slayton58/7/base -> origin/gh/slayton58/7/base 2025-12-04T09:17:10.8777798Z * [new branch] gh/slayton58/7/head -> origin/gh/slayton58/7/head 2025-12-04T09:17:10.8781996Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-12-04T09:17:10.8784631Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-12-04T09:17:10.8786799Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-12-04T09:17:10.8789685Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-12-04T09:17:10.8791789Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-12-04T09:17:10.8793843Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-12-04T09:17:10.8797014Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-12-04T09:17:10.8799136Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-12-04T09:17:10.8801288Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-12-04T09:17:10.8804228Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 2025-12-04T09:17:10.8806384Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-12-04T09:17:10.8808634Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-12-04T09:17:10.8811519Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-12-04T09:17:10.8813723Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-12-04T09:17:10.8815836Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-12-04T09:17:10.8818656Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-12-04T09:17:10.8820811Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-12-04T09:17:10.8822943Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-12-04T09:17:10.8827332Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-12-04T09:17:10.8829510Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-12-04T09:17:10.8831587Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-12-04T09:17:10.8834361Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-12-04T09:17:10.8836511Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-12-04T09:17:10.8838772Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-12-04T09:17:10.8841846Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-12-04T09:17:10.8843925Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-12-04T09:17:10.8846031Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-12-04T09:17:10.8848991Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-12-04T09:17:10.8851073Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-12-04T09:17:10.8853137Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-12-04T09:17:10.8856097Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-12-04T09:17:10.8858100Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-12-04T09:17:10.8860244Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-12-04T09:17:10.8863153Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-12-04T09:17:10.8865545Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-12-04T09:17:10.8867594Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-12-04T09:17:10.8870713Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-12-04T09:17:10.8872847Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-12-04T09:17:10.8875042Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-12-04T09:17:10.8877821Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-12-04T09:17:10.8879851Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-12-04T09:17:10.8881847Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-12-04T09:17:10.8884756Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-12-04T09:17:10.8886894Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-12-04T09:17:10.8889016Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-12-04T09:17:10.8891838Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 2025-12-04T09:17:10.8894074Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-12-04T09:17:10.8896224Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-12-04T09:17:10.8899859Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-12-04T09:17:10.8901992Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-12-04T09:17:10.8904222Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-12-04T09:17:10.8907744Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-12-04T09:17:10.8909890Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-12-04T09:17:10.8912019Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-12-04T09:17:10.8914862Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-12-04T09:17:10.8916952Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-12-04T09:17:10.8919044Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-12-04T09:17:10.8921895Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-12-04T09:17:10.8924121Z * [new branch] gh/soulitzer/375/head -> origin/gh/soulitzer/375/head 2025-12-04T09:17:10.8926374Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-12-04T09:17:10.8929339Z * [new branch] gh/soulitzer/380/base -> origin/gh/soulitzer/380/base 2025-12-04T09:17:10.8931425Z * [new branch] gh/soulitzer/380/head -> origin/gh/soulitzer/380/head 2025-12-04T09:17:10.8933497Z * [new branch] gh/soulitzer/380/orig -> origin/gh/soulitzer/380/orig 2025-12-04T09:17:10.8936370Z * [new branch] gh/soulitzer/385/base -> origin/gh/soulitzer/385/base 2025-12-04T09:17:10.8938453Z * [new branch] gh/soulitzer/385/head -> origin/gh/soulitzer/385/head 2025-12-04T09:17:10.8940558Z * [new branch] gh/soulitzer/385/orig -> origin/gh/soulitzer/385/orig 2025-12-04T09:17:10.8943643Z * [new branch] gh/soulitzer/386/base -> origin/gh/soulitzer/386/base 2025-12-04T09:17:10.8945867Z * [new branch] gh/soulitzer/386/head -> origin/gh/soulitzer/386/head 2025-12-04T09:17:10.8948025Z * [new branch] gh/soulitzer/386/orig -> origin/gh/soulitzer/386/orig 2025-12-04T09:17:10.8950869Z * [new branch] gh/soulitzer/387/base -> origin/gh/soulitzer/387/base 2025-12-04T09:17:10.8952946Z * [new branch] gh/soulitzer/387/head -> origin/gh/soulitzer/387/head 2025-12-04T09:17:10.8955086Z * [new branch] gh/soulitzer/387/orig -> origin/gh/soulitzer/387/orig 2025-12-04T09:17:10.8958122Z * [new branch] gh/soulitzer/388/base -> origin/gh/soulitzer/388/base 2025-12-04T09:17:10.8960212Z * [new branch] gh/soulitzer/388/head -> origin/gh/soulitzer/388/head 2025-12-04T09:17:10.8962361Z * [new branch] gh/soulitzer/388/orig -> origin/gh/soulitzer/388/orig 2025-12-04T09:17:10.8965230Z * [new branch] gh/soulitzer/389/base -> origin/gh/soulitzer/389/base 2025-12-04T09:17:10.8967311Z * [new branch] gh/soulitzer/389/head -> origin/gh/soulitzer/389/head 2025-12-04T09:17:10.8969471Z * [new branch] gh/soulitzer/389/orig -> origin/gh/soulitzer/389/orig 2025-12-04T09:17:10.8972356Z * [new branch] gh/soulitzer/390/base -> origin/gh/soulitzer/390/base 2025-12-04T09:17:10.8974420Z * [new branch] gh/soulitzer/390/head -> origin/gh/soulitzer/390/head 2025-12-04T09:17:10.8976556Z * [new branch] gh/soulitzer/390/orig -> origin/gh/soulitzer/390/orig 2025-12-04T09:17:10.8979923Z * [new branch] gh/soulitzer/391/base -> origin/gh/soulitzer/391/base 2025-12-04T09:17:10.8982059Z * [new branch] gh/soulitzer/391/head -> origin/gh/soulitzer/391/head 2025-12-04T09:17:10.8984284Z * [new branch] gh/soulitzer/391/orig -> origin/gh/soulitzer/391/orig 2025-12-04T09:17:10.8987931Z * [new branch] gh/soulitzer/392/base -> origin/gh/soulitzer/392/base 2025-12-04T09:17:10.8990041Z * [new branch] gh/soulitzer/392/head -> origin/gh/soulitzer/392/head 2025-12-04T09:17:10.8992172Z * [new branch] gh/soulitzer/392/orig -> origin/gh/soulitzer/392/orig 2025-12-04T09:17:10.8995514Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-12-04T09:17:10.8998587Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-12-04T09:17:10.9000689Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-12-04T09:17:10.9002807Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-12-04T09:17:10.9005590Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-12-04T09:17:10.9007924Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-12-04T09:17:10.9009802Z * [new branch] gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-12-04T09:17:10.9012766Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-12-04T09:17:10.9014792Z * [new branch] gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-12-04T09:17:10.9016948Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-12-04T09:17:10.9019807Z * [new branch] gh/swolchok/839/base -> origin/gh/swolchok/839/base 2025-12-04T09:17:10.9021873Z * [new branch] gh/swolchok/839/head -> origin/gh/swolchok/839/head 2025-12-04T09:17:10.9024101Z * [new branch] gh/swolchok/839/orig -> origin/gh/swolchok/839/orig 2025-12-04T09:17:10.9027205Z * [new branch] gh/swolchok/841/base -> origin/gh/swolchok/841/base 2025-12-04T09:17:10.9029248Z * [new branch] gh/swolchok/841/head -> origin/gh/swolchok/841/head 2025-12-04T09:17:10.9031939Z * [new branch] gh/swolchok/841/orig -> origin/gh/swolchok/841/orig 2025-12-04T09:17:10.9034741Z * [new branch] gh/swolchok/842/base -> origin/gh/swolchok/842/base 2025-12-04T09:17:10.9036869Z * [new branch] gh/swolchok/842/head -> origin/gh/swolchok/842/head 2025-12-04T09:17:10.9038974Z * [new branch] gh/swolchok/842/orig -> origin/gh/swolchok/842/orig 2025-12-04T09:17:10.9041919Z * [new branch] gh/swolchok/845/base -> origin/gh/swolchok/845/base 2025-12-04T09:17:10.9043961Z * [new branch] gh/swolchok/845/head -> origin/gh/swolchok/845/head 2025-12-04T09:17:10.9046121Z * [new branch] gh/swolchok/845/orig -> origin/gh/swolchok/845/orig 2025-12-04T09:17:10.9048988Z * [new branch] gh/swolchok/848/base -> origin/gh/swolchok/848/base 2025-12-04T09:17:10.9051221Z * [new branch] gh/swolchok/848/head -> origin/gh/swolchok/848/head 2025-12-04T09:17:10.9053353Z * [new branch] gh/swolchok/848/orig -> origin/gh/swolchok/848/orig 2025-12-04T09:17:10.9056204Z * [new branch] gh/swolchok/856/base -> origin/gh/swolchok/856/base 2025-12-04T09:17:10.9058289Z * [new branch] gh/swolchok/856/head -> origin/gh/swolchok/856/head 2025-12-04T09:17:10.9060401Z * [new branch] gh/swolchok/856/orig -> origin/gh/swolchok/856/orig 2025-12-04T09:17:10.9063438Z * [new branch] gh/swolchok/860/base -> origin/gh/swolchok/860/base 2025-12-04T09:17:10.9065611Z * [new branch] gh/swolchok/860/head -> origin/gh/swolchok/860/head 2025-12-04T09:17:10.9067645Z * [new branch] gh/swolchok/860/orig -> origin/gh/swolchok/860/orig 2025-12-04T09:17:10.9070926Z * [new branch] gh/swolchok/861/base -> origin/gh/swolchok/861/base 2025-12-04T09:17:10.9073192Z * [new branch] gh/swolchok/861/head -> origin/gh/swolchok/861/head 2025-12-04T09:17:10.9075326Z * [new branch] gh/swolchok/861/orig -> origin/gh/swolchok/861/orig 2025-12-04T09:17:10.9078659Z * [new branch] gh/swolchok/862/base -> origin/gh/swolchok/862/base 2025-12-04T09:17:10.9080707Z * [new branch] gh/swolchok/862/head -> origin/gh/swolchok/862/head 2025-12-04T09:17:10.9082766Z * [new branch] gh/swolchok/862/orig -> origin/gh/swolchok/862/orig 2025-12-04T09:17:10.9085811Z * [new branch] gh/swolchok/863/base -> origin/gh/swolchok/863/base 2025-12-04T09:17:10.9087880Z * [new branch] gh/swolchok/863/head -> origin/gh/swolchok/863/head 2025-12-04T09:17:10.9090145Z * [new branch] gh/swolchok/863/orig -> origin/gh/swolchok/863/orig 2025-12-04T09:17:10.9093154Z * [new branch] gh/swolchok/864/base -> origin/gh/swolchok/864/base 2025-12-04T09:17:10.9095119Z * [new branch] gh/swolchok/864/head -> origin/gh/swolchok/864/head 2025-12-04T09:17:10.9097234Z * [new branch] gh/swolchok/864/orig -> origin/gh/swolchok/864/orig 2025-12-04T09:17:10.9100182Z * [new branch] gh/swolchok/865/base -> origin/gh/swolchok/865/base 2025-12-04T09:17:10.9102501Z * [new branch] gh/swolchok/865/head -> origin/gh/swolchok/865/head 2025-12-04T09:17:10.9104817Z * [new branch] gh/swolchok/865/orig -> origin/gh/swolchok/865/orig 2025-12-04T09:17:10.9108279Z * [new branch] gh/swolchok/866/base -> origin/gh/swolchok/866/base 2025-12-04T09:17:10.9110458Z * [new branch] gh/swolchok/866/head -> origin/gh/swolchok/866/head 2025-12-04T09:17:10.9112647Z * [new branch] gh/swolchok/866/orig -> origin/gh/swolchok/866/orig 2025-12-04T09:17:10.9115423Z * [new branch] gh/swolchok/867/base -> origin/gh/swolchok/867/base 2025-12-04T09:17:10.9117638Z * [new branch] gh/swolchok/867/head -> origin/gh/swolchok/867/head 2025-12-04T09:17:10.9119719Z * [new branch] gh/swolchok/867/orig -> origin/gh/swolchok/867/orig 2025-12-04T09:17:10.9123098Z * [new branch] gh/swolchok/868/base -> origin/gh/swolchok/868/base 2025-12-04T09:17:10.9125291Z * [new branch] gh/swolchok/868/head -> origin/gh/swolchok/868/head 2025-12-04T09:17:10.9127679Z * [new branch] gh/swolchok/868/orig -> origin/gh/swolchok/868/orig 2025-12-04T09:17:10.9130628Z * [new branch] gh/swolchok/869/base -> origin/gh/swolchok/869/base 2025-12-04T09:17:10.9132766Z * [new branch] gh/swolchok/869/head -> origin/gh/swolchok/869/head 2025-12-04T09:17:10.9135031Z * [new branch] gh/swolchok/869/orig -> origin/gh/swolchok/869/orig 2025-12-04T09:17:10.9138005Z * [new branch] gh/swolchok/870/base -> origin/gh/swolchok/870/base 2025-12-04T09:17:10.9139991Z * [new branch] gh/swolchok/870/head -> origin/gh/swolchok/870/head 2025-12-04T09:17:10.9142657Z * [new branch] gh/swolchok/870/orig -> origin/gh/swolchok/870/orig 2025-12-04T09:17:10.9145707Z * [new branch] gh/swolchok/871/base -> origin/gh/swolchok/871/base 2025-12-04T09:17:10.9147970Z * [new branch] gh/swolchok/871/head -> origin/gh/swolchok/871/head 2025-12-04T09:17:10.9150161Z * [new branch] gh/swolchok/871/orig -> origin/gh/swolchok/871/orig 2025-12-04T09:17:10.9153576Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-12-04T09:17:10.9155709Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-12-04T09:17:10.9158008Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-12-04T09:17:10.9161431Z * [new branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-12-04T09:17:10.9163547Z * [new branch] gh/tianyu-l/2/head -> origin/gh/tianyu-l/2/head 2025-12-04T09:17:10.9165666Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-12-04T09:17:10.9168497Z * [new branch] gh/tianyu-l/3/base -> origin/gh/tianyu-l/3/base 2025-12-04T09:17:10.9170620Z * [new branch] gh/tianyu-l/3/orig -> origin/gh/tianyu-l/3/orig 2025-12-04T09:17:10.9173396Z * [new branch] gh/tianyu-l/4/base -> origin/gh/tianyu-l/4/base 2025-12-04T09:17:10.9175502Z * [new branch] gh/tianyu-l/4/head -> origin/gh/tianyu-l/4/head 2025-12-04T09:17:10.9177659Z * [new branch] gh/tianyu-l/4/orig -> origin/gh/tianyu-l/4/orig 2025-12-04T09:17:10.9181480Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-12-04T09:17:10.9183606Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-12-04T09:17:10.9185888Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-12-04T09:17:10.9188744Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-12-04T09:17:10.9190875Z * [new branch] gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-12-04T09:17:10.9192997Z * [new branch] gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-12-04T09:17:10.9195967Z * [new branch] gh/tugsbayasgalan/17/base -> origin/gh/tugsbayasgalan/17/base 2025-12-04T09:17:10.9198000Z * [new branch] gh/tugsbayasgalan/17/head -> origin/gh/tugsbayasgalan/17/head 2025-12-04T09:17:10.9200189Z * [new branch] gh/tugsbayasgalan/17/orig -> origin/gh/tugsbayasgalan/17/orig 2025-12-04T09:17:10.9203516Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-12-04T09:17:10.9205690Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-12-04T09:17:10.9207797Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-12-04T09:17:10.9210895Z * [new branch] gh/tugsbayasgalan/28/base -> origin/gh/tugsbayasgalan/28/base 2025-12-04T09:17:10.9213002Z * [new branch] gh/tugsbayasgalan/28/head -> origin/gh/tugsbayasgalan/28/head 2025-12-04T09:17:10.9215790Z * [new branch] gh/tugsbayasgalan/28/orig -> origin/gh/tugsbayasgalan/28/orig 2025-12-04T09:17:10.9218885Z * [new branch] gh/tugsbayasgalan/32/base -> origin/gh/tugsbayasgalan/32/base 2025-12-04T09:17:10.9221015Z * [new branch] gh/tugsbayasgalan/32/head -> origin/gh/tugsbayasgalan/32/head 2025-12-04T09:17:10.9223176Z * [new branch] gh/tugsbayasgalan/32/orig -> origin/gh/tugsbayasgalan/32/orig 2025-12-04T09:17:10.9228334Z * [new branch] gh/tugsbayasgalan/35/base -> origin/gh/tugsbayasgalan/35/base 2025-12-04T09:17:10.9230548Z * [new branch] gh/tugsbayasgalan/35/head -> origin/gh/tugsbayasgalan/35/head 2025-12-04T09:17:10.9232709Z * [new branch] gh/tugsbayasgalan/35/orig -> origin/gh/tugsbayasgalan/35/orig 2025-12-04T09:17:10.9235601Z * [new branch] gh/tugsbayasgalan/36/base -> origin/gh/tugsbayasgalan/36/base 2025-12-04T09:17:10.9237763Z * [new branch] gh/tugsbayasgalan/36/head -> origin/gh/tugsbayasgalan/36/head 2025-12-04T09:17:10.9239875Z * [new branch] gh/tugsbayasgalan/36/orig -> origin/gh/tugsbayasgalan/36/orig 2025-12-04T09:17:10.9242730Z * [new branch] gh/tugsbayasgalan/37/base -> origin/gh/tugsbayasgalan/37/base 2025-12-04T09:17:10.9244878Z * [new branch] gh/tugsbayasgalan/37/head -> origin/gh/tugsbayasgalan/37/head 2025-12-04T09:17:10.9247179Z * [new branch] gh/tugsbayasgalan/37/orig -> origin/gh/tugsbayasgalan/37/orig 2025-12-04T09:17:10.9250028Z * [new branch] gh/tugsbayasgalan/43/base -> origin/gh/tugsbayasgalan/43/base 2025-12-04T09:17:10.9252190Z * [new branch] gh/tugsbayasgalan/43/head -> origin/gh/tugsbayasgalan/43/head 2025-12-04T09:17:10.9254330Z * [new branch] gh/tugsbayasgalan/43/orig -> origin/gh/tugsbayasgalan/43/orig 2025-12-04T09:17:10.9257095Z * [new branch] gh/tugsbayasgalan/48/base -> origin/gh/tugsbayasgalan/48/base 2025-12-04T09:17:10.9259150Z * [new branch] gh/tugsbayasgalan/48/head -> origin/gh/tugsbayasgalan/48/head 2025-12-04T09:17:10.9261777Z * [new branch] gh/tugsbayasgalan/48/orig -> origin/gh/tugsbayasgalan/48/orig 2025-12-04T09:17:10.9264835Z * [new branch] gh/tugsbayasgalan/51/base -> origin/gh/tugsbayasgalan/51/base 2025-12-04T09:17:10.9267096Z * [new branch] gh/tugsbayasgalan/51/head -> origin/gh/tugsbayasgalan/51/head 2025-12-04T09:17:10.9269176Z * [new branch] gh/tugsbayasgalan/51/orig -> origin/gh/tugsbayasgalan/51/orig 2025-12-04T09:17:10.9271883Z * [new branch] gh/tugsbayasgalan/52/base -> origin/gh/tugsbayasgalan/52/base 2025-12-04T09:17:10.9274098Z * [new branch] gh/tugsbayasgalan/52/head -> origin/gh/tugsbayasgalan/52/head 2025-12-04T09:17:10.9276451Z * [new branch] gh/tugsbayasgalan/52/orig -> origin/gh/tugsbayasgalan/52/orig 2025-12-04T09:17:10.9279353Z * [new branch] gh/tugsbayasgalan/53/base -> origin/gh/tugsbayasgalan/53/base 2025-12-04T09:17:10.9281480Z * [new branch] gh/tugsbayasgalan/53/head -> origin/gh/tugsbayasgalan/53/head 2025-12-04T09:17:10.9283738Z * [new branch] gh/tugsbayasgalan/53/orig -> origin/gh/tugsbayasgalan/53/orig 2025-12-04T09:17:10.9286684Z * [new branch] gh/tugsbayasgalan/55/base -> origin/gh/tugsbayasgalan/55/base 2025-12-04T09:17:10.9288955Z * [new branch] gh/tugsbayasgalan/55/head -> origin/gh/tugsbayasgalan/55/head 2025-12-04T09:17:10.9291197Z * [new branch] gh/tugsbayasgalan/55/orig -> origin/gh/tugsbayasgalan/55/orig 2025-12-04T09:17:10.9294143Z * [new branch] gh/tugsbayasgalan/59/base -> origin/gh/tugsbayasgalan/59/base 2025-12-04T09:17:10.9296416Z * [new branch] gh/tugsbayasgalan/59/head -> origin/gh/tugsbayasgalan/59/head 2025-12-04T09:17:10.9298558Z * [new branch] gh/tugsbayasgalan/59/orig -> origin/gh/tugsbayasgalan/59/orig 2025-12-04T09:17:10.9301359Z * [new branch] gh/tugsbayasgalan/6/base -> origin/gh/tugsbayasgalan/6/base 2025-12-04T09:17:10.9303547Z * [new branch] gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-12-04T09:17:10.9305800Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-12-04T09:17:10.9308539Z * [new branch] gh/tugsbayasgalan/60/base -> origin/gh/tugsbayasgalan/60/base 2025-12-04T09:17:10.9310695Z * [new branch] gh/tugsbayasgalan/60/head -> origin/gh/tugsbayasgalan/60/head 2025-12-04T09:17:10.9312821Z * [new branch] gh/tugsbayasgalan/60/orig -> origin/gh/tugsbayasgalan/60/orig 2025-12-04T09:17:10.9316229Z * [new branch] gh/tugsbayasgalan/61/base -> origin/gh/tugsbayasgalan/61/base 2025-12-04T09:17:10.9318332Z * [new branch] gh/tugsbayasgalan/61/head -> origin/gh/tugsbayasgalan/61/head 2025-12-04T09:17:10.9320474Z * [new branch] gh/tugsbayasgalan/61/orig -> origin/gh/tugsbayasgalan/61/orig 2025-12-04T09:17:10.9323453Z * [new branch] gh/tugsbayasgalan/63/base -> origin/gh/tugsbayasgalan/63/base 2025-12-04T09:17:10.9325887Z * [new branch] gh/tugsbayasgalan/63/head -> origin/gh/tugsbayasgalan/63/head 2025-12-04T09:17:10.9328035Z * [new branch] gh/tugsbayasgalan/63/orig -> origin/gh/tugsbayasgalan/63/orig 2025-12-04T09:17:10.9330994Z * [new branch] gh/tugsbayasgalan/67/base -> origin/gh/tugsbayasgalan/67/base 2025-12-04T09:17:10.9333174Z * [new branch] gh/tugsbayasgalan/67/head -> origin/gh/tugsbayasgalan/67/head 2025-12-04T09:17:10.9335479Z * [new branch] gh/tugsbayasgalan/67/orig -> origin/gh/tugsbayasgalan/67/orig 2025-12-04T09:17:10.9338550Z * [new branch] gh/tugsbayasgalan/68/base -> origin/gh/tugsbayasgalan/68/base 2025-12-04T09:17:10.9340689Z * [new branch] gh/tugsbayasgalan/68/head -> origin/gh/tugsbayasgalan/68/head 2025-12-04T09:17:10.9342957Z * [new branch] gh/tugsbayasgalan/68/orig -> origin/gh/tugsbayasgalan/68/orig 2025-12-04T09:17:10.9346063Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-12-04T09:17:10.9348220Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-12-04T09:17:10.9350622Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-12-04T09:17:10.9353741Z * [new branch] gh/tugsbayasgalan/70/base -> origin/gh/tugsbayasgalan/70/base 2025-12-04T09:17:10.9356018Z * [new branch] gh/tugsbayasgalan/70/head -> origin/gh/tugsbayasgalan/70/head 2025-12-04T09:17:10.9358236Z * [new branch] gh/tugsbayasgalan/70/orig -> origin/gh/tugsbayasgalan/70/orig 2025-12-04T09:17:10.9361397Z * [new branch] gh/tugsbayasgalan/71/base -> origin/gh/tugsbayasgalan/71/base 2025-12-04T09:17:10.9363661Z * [new branch] gh/tugsbayasgalan/71/head -> origin/gh/tugsbayasgalan/71/head 2025-12-04T09:17:10.9366428Z * [new branch] gh/tugsbayasgalan/71/orig -> origin/gh/tugsbayasgalan/71/orig 2025-12-04T09:17:10.9369655Z * [new branch] gh/tugsbayasgalan/72/base -> origin/gh/tugsbayasgalan/72/base 2025-12-04T09:17:10.9371756Z * [new branch] gh/tugsbayasgalan/72/head -> origin/gh/tugsbayasgalan/72/head 2025-12-04T09:17:10.9373897Z * [new branch] gh/tugsbayasgalan/72/orig -> origin/gh/tugsbayasgalan/72/orig 2025-12-04T09:17:10.9376881Z * [new branch] gh/tugsbayasgalan/73/base -> origin/gh/tugsbayasgalan/73/base 2025-12-04T09:17:10.9379095Z * [new branch] gh/tugsbayasgalan/73/head -> origin/gh/tugsbayasgalan/73/head 2025-12-04T09:17:10.9381315Z * [new branch] gh/tugsbayasgalan/73/orig -> origin/gh/tugsbayasgalan/73/orig 2025-12-04T09:17:10.9384607Z * [new branch] gh/tugsbayasgalan/74/base -> origin/gh/tugsbayasgalan/74/base 2025-12-04T09:17:10.9386785Z * [new branch] gh/tugsbayasgalan/74/head -> origin/gh/tugsbayasgalan/74/head 2025-12-04T09:17:10.9388912Z * [new branch] gh/tugsbayasgalan/74/orig -> origin/gh/tugsbayasgalan/74/orig 2025-12-04T09:17:10.9391856Z * [new branch] gh/tugsbayasgalan/75/base -> origin/gh/tugsbayasgalan/75/base 2025-12-04T09:17:10.9394033Z * [new branch] gh/tugsbayasgalan/75/head -> origin/gh/tugsbayasgalan/75/head 2025-12-04T09:17:10.9396290Z * [new branch] gh/tugsbayasgalan/75/orig -> origin/gh/tugsbayasgalan/75/orig 2025-12-04T09:17:10.9399075Z * [new branch] gh/tugsbayasgalan/76/base -> origin/gh/tugsbayasgalan/76/base 2025-12-04T09:17:10.9401294Z * [new branch] gh/tugsbayasgalan/76/head -> origin/gh/tugsbayasgalan/76/head 2025-12-04T09:17:10.9403386Z * [new branch] gh/tugsbayasgalan/76/orig -> origin/gh/tugsbayasgalan/76/orig 2025-12-04T09:17:10.9406487Z * [new branch] gh/tugsbayasgalan/77/base -> origin/gh/tugsbayasgalan/77/base 2025-12-04T09:17:10.9408623Z * [new branch] gh/tugsbayasgalan/77/head -> origin/gh/tugsbayasgalan/77/head 2025-12-04T09:17:10.9410828Z * [new branch] gh/tugsbayasgalan/77/orig -> origin/gh/tugsbayasgalan/77/orig 2025-12-04T09:17:10.9413790Z * [new branch] gh/tugsbayasgalan/78/base -> origin/gh/tugsbayasgalan/78/base 2025-12-04T09:17:10.9416062Z * [new branch] gh/tugsbayasgalan/78/head -> origin/gh/tugsbayasgalan/78/head 2025-12-04T09:17:10.9418160Z * [new branch] gh/tugsbayasgalan/78/orig -> origin/gh/tugsbayasgalan/78/orig 2025-12-04T09:17:10.9421176Z * [new branch] gh/tugsbayasgalan/79/base -> origin/gh/tugsbayasgalan/79/base 2025-12-04T09:17:10.9423410Z * [new branch] gh/tugsbayasgalan/79/head -> origin/gh/tugsbayasgalan/79/head 2025-12-04T09:17:10.9425891Z * [new branch] gh/tugsbayasgalan/79/orig -> origin/gh/tugsbayasgalan/79/orig 2025-12-04T09:17:10.9428815Z * [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-12-04T09:17:10.9430935Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-12-04T09:17:10.9433209Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-12-04T09:17:10.9436405Z * [new branch] gh/tugsbayasgalan/80/base -> origin/gh/tugsbayasgalan/80/base 2025-12-04T09:17:10.9438472Z * [new branch] gh/tugsbayasgalan/80/head -> origin/gh/tugsbayasgalan/80/head 2025-12-04T09:17:10.9440652Z * [new branch] gh/tugsbayasgalan/80/orig -> origin/gh/tugsbayasgalan/80/orig 2025-12-04T09:17:10.9443555Z * [new branch] gh/tugsbayasgalan/81/base -> origin/gh/tugsbayasgalan/81/base 2025-12-04T09:17:10.9445637Z * [new branch] gh/tugsbayasgalan/81/head -> origin/gh/tugsbayasgalan/81/head 2025-12-04T09:17:10.9447863Z * [new branch] gh/tugsbayasgalan/81/orig -> origin/gh/tugsbayasgalan/81/orig 2025-12-04T09:17:10.9451402Z * [new branch] gh/tugsbayasgalan/82/base -> origin/gh/tugsbayasgalan/82/base 2025-12-04T09:17:10.9453664Z * [new branch] gh/tugsbayasgalan/82/head -> origin/gh/tugsbayasgalan/82/head 2025-12-04T09:17:10.9456008Z * [new branch] gh/tugsbayasgalan/82/orig -> origin/gh/tugsbayasgalan/82/orig 2025-12-04T09:17:10.9458802Z * [new branch] gh/tugsbayasgalan/83/base -> origin/gh/tugsbayasgalan/83/base 2025-12-04T09:17:10.9460961Z * [new branch] gh/tugsbayasgalan/83/head -> origin/gh/tugsbayasgalan/83/head 2025-12-04T09:17:10.9463188Z * [new branch] gh/tugsbayasgalan/83/orig -> origin/gh/tugsbayasgalan/83/orig 2025-12-04T09:17:10.9466059Z * [new branch] gh/tugsbayasgalan/84/base -> origin/gh/tugsbayasgalan/84/base 2025-12-04T09:17:10.9468199Z * [new branch] gh/tugsbayasgalan/84/head -> origin/gh/tugsbayasgalan/84/head 2025-12-04T09:17:10.9470351Z * [new branch] gh/tugsbayasgalan/84/orig -> origin/gh/tugsbayasgalan/84/orig 2025-12-04T09:17:10.9473188Z * [new branch] gh/tugsbayasgalan/85/base -> origin/gh/tugsbayasgalan/85/base 2025-12-04T09:17:10.9475580Z * [new branch] gh/tugsbayasgalan/85/head -> origin/gh/tugsbayasgalan/85/head 2025-12-04T09:17:10.9477760Z * [new branch] gh/tugsbayasgalan/85/orig -> origin/gh/tugsbayasgalan/85/orig 2025-12-04T09:17:10.9480691Z * [new branch] gh/tugsbayasgalan/86/base -> origin/gh/tugsbayasgalan/86/base 2025-12-04T09:17:10.9482896Z * [new branch] gh/tugsbayasgalan/86/head -> origin/gh/tugsbayasgalan/86/head 2025-12-04T09:17:10.9485172Z * [new branch] gh/tugsbayasgalan/86/orig -> origin/gh/tugsbayasgalan/86/orig 2025-12-04T09:17:10.9488412Z * [new branch] gh/tugsbayasgalan/87/base -> origin/gh/tugsbayasgalan/87/base 2025-12-04T09:17:10.9490607Z * [new branch] gh/tugsbayasgalan/87/head -> origin/gh/tugsbayasgalan/87/head 2025-12-04T09:17:10.9492782Z * [new branch] gh/tugsbayasgalan/87/orig -> origin/gh/tugsbayasgalan/87/orig 2025-12-04T09:17:10.9495817Z * [new branch] gh/tugsbayasgalan/88/base -> origin/gh/tugsbayasgalan/88/base 2025-12-04T09:17:10.9497888Z * [new branch] gh/tugsbayasgalan/88/head -> origin/gh/tugsbayasgalan/88/head 2025-12-04T09:17:10.9500036Z * [new branch] gh/tugsbayasgalan/88/orig -> origin/gh/tugsbayasgalan/88/orig 2025-12-04T09:17:10.9503015Z * [new branch] gh/tugsbayasgalan/89/base -> origin/gh/tugsbayasgalan/89/base 2025-12-04T09:17:10.9505303Z * [new branch] gh/tugsbayasgalan/89/head -> origin/gh/tugsbayasgalan/89/head 2025-12-04T09:17:10.9507435Z * [new branch] gh/tugsbayasgalan/89/orig -> origin/gh/tugsbayasgalan/89/orig 2025-12-04T09:17:10.9510342Z * [new branch] gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-12-04T09:17:10.9512347Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-12-04T09:17:10.9514615Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-12-04T09:17:10.9517818Z * [new branch] gh/tugsbayasgalan/90/base -> origin/gh/tugsbayasgalan/90/base 2025-12-04T09:17:10.9519810Z * [new branch] gh/tugsbayasgalan/90/head -> origin/gh/tugsbayasgalan/90/head 2025-12-04T09:17:10.9521878Z * [new branch] gh/tugsbayasgalan/90/orig -> origin/gh/tugsbayasgalan/90/orig 2025-12-04T09:17:10.9525229Z * [new branch] gh/tugsbayasgalan/91/base -> origin/gh/tugsbayasgalan/91/base 2025-12-04T09:17:10.9527403Z * [new branch] gh/tugsbayasgalan/91/head -> origin/gh/tugsbayasgalan/91/head 2025-12-04T09:17:10.9529482Z * [new branch] gh/tugsbayasgalan/91/orig -> origin/gh/tugsbayasgalan/91/orig 2025-12-04T09:17:10.9532489Z * [new branch] gh/tugsbayasgalan/92/base -> origin/gh/tugsbayasgalan/92/base 2025-12-04T09:17:10.9534733Z * [new branch] gh/tugsbayasgalan/92/head -> origin/gh/tugsbayasgalan/92/head 2025-12-04T09:17:10.9537443Z * [new branch] gh/tugsbayasgalan/92/orig -> origin/gh/tugsbayasgalan/92/orig 2025-12-04T09:17:10.9540475Z * [new branch] gh/tugsbayasgalan/93/base -> origin/gh/tugsbayasgalan/93/base 2025-12-04T09:17:10.9543170Z * [new branch] gh/tugsbayasgalan/93/head -> origin/gh/tugsbayasgalan/93/head 2025-12-04T09:17:10.9545017Z * [new branch] gh/tugsbayasgalan/93/orig -> origin/gh/tugsbayasgalan/93/orig 2025-12-04T09:17:10.9548628Z * [new branch] gh/v0i0/14/base -> origin/gh/v0i0/14/base 2025-12-04T09:17:10.9550704Z * [new branch] gh/v0i0/14/head -> origin/gh/v0i0/14/head 2025-12-04T09:17:10.9552775Z * [new branch] gh/v0i0/14/orig -> origin/gh/v0i0/14/orig 2025-12-04T09:17:10.9555499Z * [new branch] gh/v0i0/15/base -> origin/gh/v0i0/15/base 2025-12-04T09:17:10.9557765Z * [new branch] gh/v0i0/15/head -> origin/gh/v0i0/15/head 2025-12-04T09:17:10.9559892Z * [new branch] gh/v0i0/15/orig -> origin/gh/v0i0/15/orig 2025-12-04T09:17:10.9562715Z * [new branch] gh/v0i0/16/base -> origin/gh/v0i0/16/base 2025-12-04T09:17:10.9564919Z * [new branch] gh/v0i0/16/head -> origin/gh/v0i0/16/head 2025-12-04T09:17:10.9567094Z * [new branch] gh/v0i0/16/orig -> origin/gh/v0i0/16/orig 2025-12-04T09:17:10.9569932Z * [new branch] gh/v0i0/17/base -> origin/gh/v0i0/17/base 2025-12-04T09:17:10.9572193Z * [new branch] gh/v0i0/17/head -> origin/gh/v0i0/17/head 2025-12-04T09:17:10.9574295Z * [new branch] gh/v0i0/17/orig -> origin/gh/v0i0/17/orig 2025-12-04T09:17:10.9577196Z * [new branch] gh/v0i0/18/base -> origin/gh/v0i0/18/base 2025-12-04T09:17:10.9579365Z * [new branch] gh/v0i0/18/head -> origin/gh/v0i0/18/head 2025-12-04T09:17:10.9581529Z * [new branch] gh/v0i0/18/orig -> origin/gh/v0i0/18/orig 2025-12-04T09:17:10.9584644Z * [new branch] gh/v0i0/19/base -> origin/gh/v0i0/19/base 2025-12-04T09:17:10.9586776Z * [new branch] gh/v0i0/19/head -> origin/gh/v0i0/19/head 2025-12-04T09:17:10.9588978Z * [new branch] gh/v0i0/19/orig -> origin/gh/v0i0/19/orig 2025-12-04T09:17:10.9592419Z * [new branch] gh/vishal9-team/1/base -> origin/gh/vishal9-team/1/base 2025-12-04T09:17:10.9594576Z * [new branch] gh/vishal9-team/1/head -> origin/gh/vishal9-team/1/head 2025-12-04T09:17:10.9597365Z * [new branch] gh/vishal9-team/2/base -> origin/gh/vishal9-team/2/base 2025-12-04T09:17:10.9599945Z * [new branch] gh/vishal9-team/2/head -> origin/gh/vishal9-team/2/head 2025-12-04T09:17:10.9602147Z * [new branch] gh/vishal9-team/2/orig -> origin/gh/vishal9-team/2/orig 2025-12-04T09:17:10.9605726Z * [new branch] gh/vishal9-team/3/base -> origin/gh/vishal9-team/3/base 2025-12-04T09:17:10.9608032Z * [new branch] gh/vishal9-team/3/head -> origin/gh/vishal9-team/3/head 2025-12-04T09:17:10.9610192Z * [new branch] gh/vishal9-team/3/orig -> origin/gh/vishal9-team/3/orig 2025-12-04T09:17:10.9612919Z * [new branch] gh/vishal9-team/4/base -> origin/gh/vishal9-team/4/base 2025-12-04T09:17:10.9615054Z * [new branch] gh/vishal9-team/4/head -> origin/gh/vishal9-team/4/head 2025-12-04T09:17:10.9617208Z * [new branch] gh/vishal9-team/4/orig -> origin/gh/vishal9-team/4/orig 2025-12-04T09:17:10.9620570Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-12-04T09:17:10.9623979Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-12-04T09:17:10.9629087Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-12-04T09:17:10.9632528Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-12-04T09:17:10.9634695Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-12-04T09:17:10.9637000Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-12-04T09:17:10.9639784Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-12-04T09:17:10.9641972Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-12-04T09:17:10.9644320Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-12-04T09:17:10.9646995Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-12-04T09:17:10.9649153Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-12-04T09:17:10.9651391Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-12-04T09:17:10.9655140Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-12-04T09:17:10.9657379Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-12-04T09:17:10.9659482Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-12-04T09:17:10.9662320Z * [new branch] gh/wconstab/448/base -> origin/gh/wconstab/448/base 2025-12-04T09:17:10.9664648Z * [new branch] gh/wconstab/448/head -> origin/gh/wconstab/448/head 2025-12-04T09:17:10.9666753Z * [new branch] gh/wconstab/448/orig -> origin/gh/wconstab/448/orig 2025-12-04T09:17:10.9669459Z * [new branch] gh/wconstab/449/base -> origin/gh/wconstab/449/base 2025-12-04T09:17:10.9671693Z * [new branch] gh/wconstab/449/head -> origin/gh/wconstab/449/head 2025-12-04T09:17:10.9673861Z * [new branch] gh/wconstab/449/orig -> origin/gh/wconstab/449/orig 2025-12-04T09:17:10.9676577Z * [new branch] gh/wconstab/450/base -> origin/gh/wconstab/450/base 2025-12-04T09:17:10.9678824Z * [new branch] gh/wconstab/450/head -> origin/gh/wconstab/450/head 2025-12-04T09:17:10.9681018Z * [new branch] gh/wconstab/450/orig -> origin/gh/wconstab/450/orig 2025-12-04T09:17:10.9683777Z * [new branch] gh/wconstab/451/base -> origin/gh/wconstab/451/base 2025-12-04T09:17:10.9686176Z * [new branch] gh/wconstab/451/head -> origin/gh/wconstab/451/head 2025-12-04T09:17:10.9688329Z * [new branch] gh/wconstab/451/orig -> origin/gh/wconstab/451/orig 2025-12-04T09:17:10.9691250Z * [new branch] gh/wconstab/452/base -> origin/gh/wconstab/452/base 2025-12-04T09:17:10.9693822Z * [new branch] gh/wconstab/452/head -> origin/gh/wconstab/452/head 2025-12-04T09:17:10.9696114Z * [new branch] gh/wconstab/452/orig -> origin/gh/wconstab/452/orig 2025-12-04T09:17:10.9698670Z * [new branch] gh/wconstab/453/base -> origin/gh/wconstab/453/base 2025-12-04T09:17:10.9700869Z * [new branch] gh/wconstab/453/head -> origin/gh/wconstab/453/head 2025-12-04T09:17:10.9703208Z * [new branch] gh/wconstab/453/orig -> origin/gh/wconstab/453/orig 2025-12-04T09:17:10.9706054Z * [new branch] gh/wconstab/454/base -> origin/gh/wconstab/454/base 2025-12-04T09:17:10.9708687Z * [new branch] gh/wconstab/454/head -> origin/gh/wconstab/454/head 2025-12-04T09:17:10.9710843Z * [new branch] gh/wconstab/454/orig -> origin/gh/wconstab/454/orig 2025-12-04T09:17:10.9713765Z * [new branch] gh/wconstab/455/base -> origin/gh/wconstab/455/base 2025-12-04T09:17:10.9716045Z * [new branch] gh/wconstab/455/head -> origin/gh/wconstab/455/head 2025-12-04T09:17:10.9718325Z * [new branch] gh/wconstab/455/orig -> origin/gh/wconstab/455/orig 2025-12-04T09:17:10.9721432Z * [new branch] gh/wconstab/456/base -> origin/gh/wconstab/456/base 2025-12-04T09:17:10.9723835Z * [new branch] gh/wconstab/456/head -> origin/gh/wconstab/456/head 2025-12-04T09:17:10.9726368Z * [new branch] gh/wconstab/456/orig -> origin/gh/wconstab/456/orig 2025-12-04T09:17:10.9729172Z * [new branch] gh/wconstab/457/base -> origin/gh/wconstab/457/base 2025-12-04T09:17:10.9731346Z * [new branch] gh/wconstab/457/head -> origin/gh/wconstab/457/head 2025-12-04T09:17:10.9733473Z * [new branch] gh/wconstab/457/orig -> origin/gh/wconstab/457/orig 2025-12-04T09:17:10.9736404Z * [new branch] gh/wconstab/458/base -> origin/gh/wconstab/458/base 2025-12-04T09:17:10.9738604Z * [new branch] gh/wconstab/458/head -> origin/gh/wconstab/458/head 2025-12-04T09:17:10.9740728Z * [new branch] gh/wconstab/458/orig -> origin/gh/wconstab/458/orig 2025-12-04T09:17:10.9743551Z * [new branch] gh/wconstab/459/base -> origin/gh/wconstab/459/base 2025-12-04T09:17:10.9745877Z * [new branch] gh/wconstab/459/head -> origin/gh/wconstab/459/head 2025-12-04T09:17:10.9747936Z * [new branch] gh/wconstab/459/orig -> origin/gh/wconstab/459/orig 2025-12-04T09:17:10.9751546Z * [new branch] gh/wconstab/460/base -> origin/gh/wconstab/460/base 2025-12-04T09:17:10.9753949Z * [new branch] gh/wconstab/460/head -> origin/gh/wconstab/460/head 2025-12-04T09:17:10.9756149Z * [new branch] gh/wconstab/460/orig -> origin/gh/wconstab/460/orig 2025-12-04T09:17:10.9759180Z * [new branch] gh/wconstab/461/base -> origin/gh/wconstab/461/base 2025-12-04T09:17:10.9761307Z * [new branch] gh/wconstab/461/head -> origin/gh/wconstab/461/head 2025-12-04T09:17:10.9763390Z * [new branch] gh/wconstab/461/orig -> origin/gh/wconstab/461/orig 2025-12-04T09:17:10.9766152Z * [new branch] gh/wconstab/462/base -> origin/gh/wconstab/462/base 2025-12-04T09:17:10.9768530Z * [new branch] gh/wconstab/462/head -> origin/gh/wconstab/462/head 2025-12-04T09:17:10.9771202Z * [new branch] gh/wconstab/462/orig -> origin/gh/wconstab/462/orig 2025-12-04T09:17:10.9774115Z * [new branch] gh/wconstab/463/base -> origin/gh/wconstab/463/base 2025-12-04T09:17:10.9776367Z * [new branch] gh/wconstab/463/head -> origin/gh/wconstab/463/head 2025-12-04T09:17:10.9778527Z * [new branch] gh/wconstab/463/orig -> origin/gh/wconstab/463/orig 2025-12-04T09:17:10.9781467Z * [new branch] gh/wconstab/464/base -> origin/gh/wconstab/464/base 2025-12-04T09:17:10.9783993Z * [new branch] gh/wconstab/464/head -> origin/gh/wconstab/464/head 2025-12-04T09:17:10.9786055Z * [new branch] gh/wconstab/464/orig -> origin/gh/wconstab/464/orig 2025-12-04T09:17:10.9788977Z * [new branch] gh/wconstab/465/base -> origin/gh/wconstab/465/base 2025-12-04T09:17:10.9791145Z * [new branch] gh/wconstab/465/head -> origin/gh/wconstab/465/head 2025-12-04T09:17:10.9793186Z * [new branch] gh/wconstab/465/orig -> origin/gh/wconstab/465/orig 2025-12-04T09:17:10.9796209Z * [new branch] gh/wconstab/466/base -> origin/gh/wconstab/466/base 2025-12-04T09:17:10.9798266Z * [new branch] gh/wconstab/466/head -> origin/gh/wconstab/466/head 2025-12-04T09:17:10.9800328Z * [new branch] gh/wconstab/466/orig -> origin/gh/wconstab/466/orig 2025-12-04T09:17:10.9803524Z * [new branch] gh/wconstab/467/base -> origin/gh/wconstab/467/base 2025-12-04T09:17:10.9805856Z * [new branch] gh/wconstab/467/head -> origin/gh/wconstab/467/head 2025-12-04T09:17:10.9808169Z * [new branch] gh/wconstab/467/orig -> origin/gh/wconstab/467/orig 2025-12-04T09:17:10.9810877Z * [new branch] gh/wconstab/468/base -> origin/gh/wconstab/468/base 2025-12-04T09:17:10.9813014Z * [new branch] gh/wconstab/468/head -> origin/gh/wconstab/468/head 2025-12-04T09:17:10.9815150Z * [new branch] gh/wconstab/468/orig -> origin/gh/wconstab/468/orig 2025-12-04T09:17:10.9818728Z * [new branch] gh/weifengpy/39/base -> origin/gh/weifengpy/39/base 2025-12-04T09:17:10.9820823Z * [new branch] gh/weifengpy/39/head -> origin/gh/weifengpy/39/head 2025-12-04T09:17:10.9823232Z * [new branch] gh/weifengpy/39/orig -> origin/gh/weifengpy/39/orig 2025-12-04T09:17:10.9826276Z * [new branch] gh/weifengpy/40/base -> origin/gh/weifengpy/40/base 2025-12-04T09:17:10.9828590Z * [new branch] gh/weifengpy/40/head -> origin/gh/weifengpy/40/head 2025-12-04T09:17:10.9830710Z * [new branch] gh/weifengpy/40/orig -> origin/gh/weifengpy/40/orig 2025-12-04T09:17:10.9833812Z * [new branch] gh/weifengpy/41/base -> origin/gh/weifengpy/41/base 2025-12-04T09:17:10.9836061Z * [new branch] gh/weifengpy/41/head -> origin/gh/weifengpy/41/head 2025-12-04T09:17:10.9838326Z * [new branch] gh/weifengpy/41/orig -> origin/gh/weifengpy/41/orig 2025-12-04T09:17:10.9841858Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-12-04T09:17:10.9844000Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-12-04T09:17:10.9846180Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-12-04T09:17:10.9849099Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-12-04T09:17:10.9851396Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-12-04T09:17:10.9853549Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-12-04T09:17:10.9856451Z * [new branch] gh/williamwen42/282/base -> origin/gh/williamwen42/282/base 2025-12-04T09:17:10.9865784Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-12-04T09:17:10.9866327Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-12-04T09:17:10.9866684Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-12-04T09:17:10.9866941Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-12-04T09:17:10.9868382Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 2025-12-04T09:17:10.9871413Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-12-04T09:17:10.9873333Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-12-04T09:17:10.9875503Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-12-04T09:17:10.9878639Z * [new branch] gh/williamwen42/296/base -> origin/gh/williamwen42/296/base 2025-12-04T09:17:10.9880926Z * [new branch] gh/williamwen42/296/head -> origin/gh/williamwen42/296/head 2025-12-04T09:17:10.9883089Z * [new branch] gh/williamwen42/296/orig -> origin/gh/williamwen42/296/orig 2025-12-04T09:17:10.9885944Z * [new branch] gh/williamwen42/297/base -> origin/gh/williamwen42/297/base 2025-12-04T09:17:10.9888145Z * [new branch] gh/williamwen42/297/head -> origin/gh/williamwen42/297/head 2025-12-04T09:17:10.9890349Z * [new branch] gh/williamwen42/297/orig -> origin/gh/williamwen42/297/orig 2025-12-04T09:17:10.9893364Z * [new branch] gh/williamwen42/306/base -> origin/gh/williamwen42/306/base 2025-12-04T09:17:10.9896767Z * [new branch] gh/williamwen42/306/head -> origin/gh/williamwen42/306/head 2025-12-04T09:17:10.9898548Z * [new branch] gh/williamwen42/306/orig -> origin/gh/williamwen42/306/orig 2025-12-04T09:17:10.9900542Z * [new branch] gh/williamwen42/309/base -> origin/gh/williamwen42/309/base 2025-12-04T09:17:10.9902773Z * [new branch] gh/williamwen42/309/head -> origin/gh/williamwen42/309/head 2025-12-04T09:17:10.9905058Z * [new branch] gh/williamwen42/309/orig -> origin/gh/williamwen42/309/orig 2025-12-04T09:17:10.9907927Z * [new branch] gh/williamwen42/310/base -> origin/gh/williamwen42/310/base 2025-12-04T09:17:10.9910112Z * [new branch] gh/williamwen42/310/head -> origin/gh/williamwen42/310/head 2025-12-04T09:17:10.9912399Z * [new branch] gh/williamwen42/310/orig -> origin/gh/williamwen42/310/orig 2025-12-04T09:17:10.9916374Z * [new branch] gh/williamwen42/311/base -> origin/gh/williamwen42/311/base 2025-12-04T09:17:10.9918500Z * [new branch] gh/williamwen42/311/head -> origin/gh/williamwen42/311/head 2025-12-04T09:17:10.9920647Z * [new branch] gh/williamwen42/311/orig -> origin/gh/williamwen42/311/orig 2025-12-04T09:17:10.9923473Z * [new branch] gh/williamwen42/319/base -> origin/gh/williamwen42/319/base 2025-12-04T09:17:10.9925827Z * [new branch] gh/williamwen42/319/head -> origin/gh/williamwen42/319/head 2025-12-04T09:17:10.9928002Z * [new branch] gh/williamwen42/319/orig -> origin/gh/williamwen42/319/orig 2025-12-04T09:17:10.9931022Z * [new branch] gh/williamwen42/325/base -> origin/gh/williamwen42/325/base 2025-12-04T09:17:10.9933279Z * [new branch] gh/williamwen42/325/head -> origin/gh/williamwen42/325/head 2025-12-04T09:17:10.9935423Z * [new branch] gh/williamwen42/325/orig -> origin/gh/williamwen42/325/orig 2025-12-04T09:17:10.9938266Z * [new branch] gh/williamwen42/326/base -> origin/gh/williamwen42/326/base 2025-12-04T09:17:10.9940561Z * [new branch] gh/williamwen42/326/head -> origin/gh/williamwen42/326/head 2025-12-04T09:17:10.9942856Z * [new branch] gh/williamwen42/326/orig -> origin/gh/williamwen42/326/orig 2025-12-04T09:17:10.9945877Z * [new branch] gh/williamwen42/327/base -> origin/gh/williamwen42/327/base 2025-12-04T09:17:10.9948063Z * [new branch] gh/williamwen42/327/head -> origin/gh/williamwen42/327/head 2025-12-04T09:17:10.9950173Z * [new branch] gh/williamwen42/327/orig -> origin/gh/williamwen42/327/orig 2025-12-04T09:17:10.9953217Z * [new branch] gh/williamwen42/328/base -> origin/gh/williamwen42/328/base 2025-12-04T09:17:10.9955600Z * [new branch] gh/williamwen42/328/head -> origin/gh/williamwen42/328/head 2025-12-04T09:17:10.9957651Z * [new branch] gh/williamwen42/328/orig -> origin/gh/williamwen42/328/orig 2025-12-04T09:17:10.9961145Z * [new branch] gh/williamwen42/329/base -> origin/gh/williamwen42/329/base 2025-12-04T09:17:10.9963387Z * [new branch] gh/williamwen42/329/head -> origin/gh/williamwen42/329/head 2025-12-04T09:17:10.9965523Z * [new branch] gh/williamwen42/329/orig -> origin/gh/williamwen42/329/orig 2025-12-04T09:17:10.9968487Z * [new branch] gh/williamwen42/330/base -> origin/gh/williamwen42/330/base 2025-12-04T09:17:10.9970787Z * [new branch] gh/williamwen42/330/head -> origin/gh/williamwen42/330/head 2025-12-04T09:17:10.9972966Z * [new branch] gh/williamwen42/330/orig -> origin/gh/williamwen42/330/orig 2025-12-04T09:17:10.9975805Z * [new branch] gh/williamwen42/331/base -> origin/gh/williamwen42/331/base 2025-12-04T09:17:10.9977927Z * [new branch] gh/williamwen42/331/head -> origin/gh/williamwen42/331/head 2025-12-04T09:17:10.9980050Z * [new branch] gh/williamwen42/331/orig -> origin/gh/williamwen42/331/orig 2025-12-04T09:17:10.9983157Z * [new branch] gh/williamwen42/332/base -> origin/gh/williamwen42/332/base 2025-12-04T09:17:10.9985826Z * [new branch] gh/williamwen42/332/head -> origin/gh/williamwen42/332/head 2025-12-04T09:17:10.9987995Z * [new branch] gh/williamwen42/332/orig -> origin/gh/williamwen42/332/orig 2025-12-04T09:17:10.9991201Z * [new branch] gh/williamwen42/333/base -> origin/gh/williamwen42/333/base 2025-12-04T09:17:10.9993309Z * [new branch] gh/williamwen42/333/head -> origin/gh/williamwen42/333/head 2025-12-04T09:17:10.9995516Z * [new branch] gh/williamwen42/333/orig -> origin/gh/williamwen42/333/orig 2025-12-04T09:17:10.9998485Z * [new branch] gh/williamwen42/334/base -> origin/gh/williamwen42/334/base 2025-12-04T09:17:11.0000647Z * [new branch] gh/williamwen42/334/head -> origin/gh/williamwen42/334/head 2025-12-04T09:17:11.0002807Z * [new branch] gh/williamwen42/334/orig -> origin/gh/williamwen42/334/orig 2025-12-04T09:17:11.0009083Z * [new branch] gh/williamwen42/335/base -> origin/gh/williamwen42/335/base 2025-12-04T09:17:11.0011285Z * [new branch] gh/williamwen42/335/head -> origin/gh/williamwen42/335/head 2025-12-04T09:17:11.0013463Z * [new branch] gh/williamwen42/335/orig -> origin/gh/williamwen42/335/orig 2025-12-04T09:17:11.0016613Z * [new branch] gh/williamwen42/336/base -> origin/gh/williamwen42/336/base 2025-12-04T09:17:11.0018620Z * [new branch] gh/williamwen42/336/head -> origin/gh/williamwen42/336/head 2025-12-04T09:17:11.0020770Z * [new branch] gh/williamwen42/336/orig -> origin/gh/williamwen42/336/orig 2025-12-04T09:17:11.0023914Z * [new branch] gh/williamwen42/337/base -> origin/gh/williamwen42/337/base 2025-12-04T09:17:11.0028232Z * [new branch] gh/williamwen42/337/head -> origin/gh/williamwen42/337/head 2025-12-04T09:17:11.0030345Z * [new branch] gh/williamwen42/337/orig -> origin/gh/williamwen42/337/orig 2025-12-04T09:17:11.0033352Z * [new branch] gh/williamwen42/338/base -> origin/gh/williamwen42/338/base 2025-12-04T09:17:11.0035479Z * [new branch] gh/williamwen42/338/head -> origin/gh/williamwen42/338/head 2025-12-04T09:17:11.0037692Z * [new branch] gh/williamwen42/338/orig -> origin/gh/williamwen42/338/orig 2025-12-04T09:17:11.0040632Z * [new branch] gh/williamwen42/339/base -> origin/gh/williamwen42/339/base 2025-12-04T09:17:11.0042923Z * [new branch] gh/williamwen42/339/head -> origin/gh/williamwen42/339/head 2025-12-04T09:17:11.0044928Z * [new branch] gh/williamwen42/339/orig -> origin/gh/williamwen42/339/orig 2025-12-04T09:17:11.0048000Z * [new branch] gh/williamwen42/340/base -> origin/gh/williamwen42/340/base 2025-12-04T09:17:11.0050212Z * [new branch] gh/williamwen42/340/head -> origin/gh/williamwen42/340/head 2025-12-04T09:17:11.0052344Z * [new branch] gh/williamwen42/340/orig -> origin/gh/williamwen42/340/orig 2025-12-04T09:17:11.0055374Z * [new branch] gh/williamwen42/341/base -> origin/gh/williamwen42/341/base 2025-12-04T09:17:11.0057538Z * [new branch] gh/williamwen42/341/head -> origin/gh/williamwen42/341/head 2025-12-04T09:17:11.0059763Z * [new branch] gh/williamwen42/341/orig -> origin/gh/williamwen42/341/orig 2025-12-04T09:17:11.0063258Z * [new branch] gh/williamwen42/342/base -> origin/gh/williamwen42/342/base 2025-12-04T09:17:11.0065500Z * [new branch] gh/williamwen42/342/head -> origin/gh/williamwen42/342/head 2025-12-04T09:17:11.0067679Z * [new branch] gh/williamwen42/342/orig -> origin/gh/williamwen42/342/orig 2025-12-04T09:17:11.0070699Z * [new branch] gh/williamwen42/343/base -> origin/gh/williamwen42/343/base 2025-12-04T09:17:11.0072848Z * [new branch] gh/williamwen42/343/head -> origin/gh/williamwen42/343/head 2025-12-04T09:17:11.0075008Z * [new branch] gh/williamwen42/343/orig -> origin/gh/williamwen42/343/orig 2025-12-04T09:17:11.0078106Z * [new branch] gh/williamwen42/344/base -> origin/gh/williamwen42/344/base 2025-12-04T09:17:11.0080329Z * [new branch] gh/williamwen42/344/head -> origin/gh/williamwen42/344/head 2025-12-04T09:17:11.0082416Z * [new branch] gh/williamwen42/344/orig -> origin/gh/williamwen42/344/orig 2025-12-04T09:17:11.0085480Z * [new branch] gh/williamwen42/345/base -> origin/gh/williamwen42/345/base 2025-12-04T09:17:11.0087693Z * [new branch] gh/williamwen42/345/head -> origin/gh/williamwen42/345/head 2025-12-04T09:17:11.0089911Z * [new branch] gh/williamwen42/345/orig -> origin/gh/williamwen42/345/orig 2025-12-04T09:17:11.0092859Z * [new branch] gh/williamwen42/346/base -> origin/gh/williamwen42/346/base 2025-12-04T09:17:11.0095112Z * [new branch] gh/williamwen42/346/head -> origin/gh/williamwen42/346/head 2025-12-04T09:17:11.0097388Z * [new branch] gh/williamwen42/346/orig -> origin/gh/williamwen42/346/orig 2025-12-04T09:17:11.0100471Z * [new branch] gh/williamwen42/347/base -> origin/gh/williamwen42/347/base 2025-12-04T09:17:11.0102605Z * [new branch] gh/williamwen42/347/head -> origin/gh/williamwen42/347/head 2025-12-04T09:17:11.0104895Z * [new branch] gh/williamwen42/347/orig -> origin/gh/williamwen42/347/orig 2025-12-04T09:17:11.0107844Z * [new branch] gh/williamwen42/348/base -> origin/gh/williamwen42/348/base 2025-12-04T09:17:11.0109910Z * [new branch] gh/williamwen42/348/head -> origin/gh/williamwen42/348/head 2025-12-04T09:17:11.0112014Z * [new branch] gh/williamwen42/348/orig -> origin/gh/williamwen42/348/orig 2025-12-04T09:17:11.0114787Z * [new branch] gh/williamwen42/349/base -> origin/gh/williamwen42/349/base 2025-12-04T09:17:11.0117008Z * [new branch] gh/williamwen42/349/head -> origin/gh/williamwen42/349/head 2025-12-04T09:17:11.0119127Z * [new branch] gh/williamwen42/349/orig -> origin/gh/williamwen42/349/orig 2025-12-04T09:17:11.0122067Z * [new branch] gh/williamwen42/350/base -> origin/gh/williamwen42/350/base 2025-12-04T09:17:11.0124245Z * [new branch] gh/williamwen42/350/head -> origin/gh/williamwen42/350/head 2025-12-04T09:17:11.0129050Z * [new branch] gh/williamwen42/350/orig -> origin/gh/williamwen42/350/orig 2025-12-04T09:17:11.0129506Z * [new branch] gh/williamwen42/351/base -> origin/gh/williamwen42/351/base 2025-12-04T09:17:11.0132027Z * [new branch] gh/williamwen42/351/head -> origin/gh/williamwen42/351/head 2025-12-04T09:17:11.0134213Z * [new branch] gh/williamwen42/351/orig -> origin/gh/williamwen42/351/orig 2025-12-04T09:17:11.0137291Z * [new branch] gh/williamwen42/352/base -> origin/gh/williamwen42/352/base 2025-12-04T09:17:11.0139392Z * [new branch] gh/williamwen42/352/head -> origin/gh/williamwen42/352/head 2025-12-04T09:17:11.0141542Z * [new branch] gh/williamwen42/352/orig -> origin/gh/williamwen42/352/orig 2025-12-04T09:17:11.0144790Z * [new branch] gh/williamwen42/353/base -> origin/gh/williamwen42/353/base 2025-12-04T09:17:11.0147005Z * [new branch] gh/williamwen42/353/head -> origin/gh/williamwen42/353/head 2025-12-04T09:17:11.0149191Z * [new branch] gh/williamwen42/353/orig -> origin/gh/williamwen42/353/orig 2025-12-04T09:17:11.0152037Z * [new branch] gh/williamwen42/354/base -> origin/gh/williamwen42/354/base 2025-12-04T09:17:11.0154307Z * [new branch] gh/williamwen42/354/head -> origin/gh/williamwen42/354/head 2025-12-04T09:17:11.0156495Z * [new branch] gh/williamwen42/354/orig -> origin/gh/williamwen42/354/orig 2025-12-04T09:17:11.0159425Z * [new branch] gh/williamwen42/355/base -> origin/gh/williamwen42/355/base 2025-12-04T09:17:11.0161528Z * [new branch] gh/williamwen42/355/head -> origin/gh/williamwen42/355/head 2025-12-04T09:17:11.0164136Z * [new branch] gh/williamwen42/355/orig -> origin/gh/williamwen42/355/orig 2025-12-04T09:17:11.0167318Z * [new branch] gh/williamwen42/356/base -> origin/gh/williamwen42/356/base 2025-12-04T09:17:11.0169443Z * [new branch] gh/williamwen42/356/head -> origin/gh/williamwen42/356/head 2025-12-04T09:17:11.0171566Z * [new branch] gh/williamwen42/356/orig -> origin/gh/williamwen42/356/orig 2025-12-04T09:17:11.0174597Z * [new branch] gh/williamwen42/357/base -> origin/gh/williamwen42/357/base 2025-12-04T09:17:11.0176792Z * [new branch] gh/williamwen42/357/head -> origin/gh/williamwen42/357/head 2025-12-04T09:17:11.0178887Z * [new branch] gh/williamwen42/357/orig -> origin/gh/williamwen42/357/orig 2025-12-04T09:17:11.0181829Z * [new branch] gh/williamwen42/358/base -> origin/gh/williamwen42/358/base 2025-12-04T09:17:11.0184128Z * [new branch] gh/williamwen42/358/head -> origin/gh/williamwen42/358/head 2025-12-04T09:17:11.0186530Z * [new branch] gh/williamwen42/358/orig -> origin/gh/williamwen42/358/orig 2025-12-04T09:17:11.0189792Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-12-04T09:17:11.0191921Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-12-04T09:17:11.0194748Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-12-04T09:17:11.0196809Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-12-04T09:17:11.0199612Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-12-04T09:17:11.0201725Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-12-04T09:17:11.0203918Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-12-04T09:17:11.0206712Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-12-04T09:17:11.0208829Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-12-04T09:17:11.0210974Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-12-04T09:17:11.0214687Z * [new branch] gh/xmfan/301/base -> origin/gh/xmfan/301/base 2025-12-04T09:17:11.0216408Z * [new branch] gh/xmfan/301/head -> origin/gh/xmfan/301/head 2025-12-04T09:17:11.0218512Z * [new branch] gh/xmfan/301/orig -> origin/gh/xmfan/301/orig 2025-12-04T09:17:11.0221479Z * [new branch] gh/xmfan/304/base -> origin/gh/xmfan/304/base 2025-12-04T09:17:11.0223762Z * [new branch] gh/xmfan/304/head -> origin/gh/xmfan/304/head 2025-12-04T09:17:11.0226124Z * [new branch] gh/xmfan/304/orig -> origin/gh/xmfan/304/orig 2025-12-04T09:17:11.0228881Z * [new branch] gh/xmfan/309/base -> origin/gh/xmfan/309/base 2025-12-04T09:17:11.0230978Z * [new branch] gh/xmfan/309/head -> origin/gh/xmfan/309/head 2025-12-04T09:17:11.0233084Z * [new branch] gh/xmfan/309/orig -> origin/gh/xmfan/309/orig 2025-12-04T09:17:11.0235935Z * [new branch] gh/xmfan/310/base -> origin/gh/xmfan/310/base 2025-12-04T09:17:11.0238004Z * [new branch] gh/xmfan/310/head -> origin/gh/xmfan/310/head 2025-12-04T09:17:11.0240162Z * [new branch] gh/xmfan/310/orig -> origin/gh/xmfan/310/orig 2025-12-04T09:17:11.0242937Z * [new branch] gh/xmfan/311/base -> origin/gh/xmfan/311/base 2025-12-04T09:17:11.0245044Z * [new branch] gh/xmfan/311/head -> origin/gh/xmfan/311/head 2025-12-04T09:17:11.0247275Z * [new branch] gh/xmfan/311/orig -> origin/gh/xmfan/311/orig 2025-12-04T09:17:11.0250228Z * [new branch] gh/xmfan/312/base -> origin/gh/xmfan/312/base 2025-12-04T09:17:11.0252360Z * [new branch] gh/xmfan/312/head -> origin/gh/xmfan/312/head 2025-12-04T09:17:11.0254511Z * [new branch] gh/xmfan/312/orig -> origin/gh/xmfan/312/orig 2025-12-04T09:17:11.0257463Z * [new branch] gh/xmfan/313/base -> origin/gh/xmfan/313/base 2025-12-04T09:17:11.0259599Z * [new branch] gh/xmfan/313/head -> origin/gh/xmfan/313/head 2025-12-04T09:17:11.0261761Z * [new branch] gh/xmfan/313/orig -> origin/gh/xmfan/313/orig 2025-12-04T09:17:11.0265398Z * [new branch] gh/xuanzhang816/27/base -> origin/gh/xuanzhang816/27/base 2025-12-04T09:17:11.0267556Z * [new branch] gh/xuanzhang816/27/head -> origin/gh/xuanzhang816/27/head 2025-12-04T09:17:11.0269681Z * [new branch] gh/xuanzhang816/27/orig -> origin/gh/xuanzhang816/27/orig 2025-12-04T09:17:11.0272634Z * [new branch] gh/xuanzhang816/32/base -> origin/gh/xuanzhang816/32/base 2025-12-04T09:17:11.0274769Z * [new branch] gh/xuanzhang816/32/head -> origin/gh/xuanzhang816/32/head 2025-12-04T09:17:11.0276939Z * [new branch] gh/xuanzhang816/32/orig -> origin/gh/xuanzhang816/32/orig 2025-12-04T09:17:11.0279865Z * [new branch] gh/xuanzhang816/33/base -> origin/gh/xuanzhang816/33/base 2025-12-04T09:17:11.0281974Z * [new branch] gh/xuanzhang816/33/head -> origin/gh/xuanzhang816/33/head 2025-12-04T09:17:11.0284100Z * [new branch] gh/xuanzhang816/33/orig -> origin/gh/xuanzhang816/33/orig 2025-12-04T09:17:11.0287275Z * [new branch] gh/xuanzhang816/34/base -> origin/gh/xuanzhang816/34/base 2025-12-04T09:17:11.0289437Z * [new branch] gh/xuanzhang816/34/head -> origin/gh/xuanzhang816/34/head 2025-12-04T09:17:11.0291649Z * [new branch] gh/xuanzhang816/34/orig -> origin/gh/xuanzhang816/34/orig 2025-12-04T09:17:11.0294684Z * [new branch] gh/xuanzhang816/35/base -> origin/gh/xuanzhang816/35/base 2025-12-04T09:17:11.0296794Z * [new branch] gh/xuanzhang816/35/head -> origin/gh/xuanzhang816/35/head 2025-12-04T09:17:11.0299056Z * [new branch] gh/xuanzhang816/35/orig -> origin/gh/xuanzhang816/35/orig 2025-12-04T09:17:11.0302443Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-12-04T09:17:11.0304762Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-12-04T09:17:11.0307028Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 2025-12-04T09:17:11.0309837Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-12-04T09:17:11.0312005Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-12-04T09:17:11.0314108Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-12-04T09:17:11.0317543Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-12-04T09:17:11.0319693Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-12-04T09:17:11.0321842Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-12-04T09:17:11.0324672Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-12-04T09:17:11.0327153Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-12-04T09:17:11.0329306Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-12-04T09:17:11.0332011Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-12-04T09:17:11.0334136Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-12-04T09:17:11.0336921Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-12-04T09:17:11.0339695Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-12-04T09:17:11.0341834Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-12-04T09:17:11.0344130Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-12-04T09:17:11.0347040Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-12-04T09:17:11.0349200Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-12-04T09:17:11.0351345Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-12-04T09:17:11.0354132Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-12-04T09:17:11.0356322Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-12-04T09:17:11.0358446Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-12-04T09:17:11.0361248Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-12-04T09:17:11.0363395Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-12-04T09:17:11.0366289Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-12-04T09:17:11.0368699Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-12-04T09:17:11.0370902Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-12-04T09:17:11.0373752Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-12-04T09:17:11.0375911Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-12-04T09:17:11.0378548Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-12-04T09:17:11.0381940Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-12-04T09:17:11.0384232Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-12-04T09:17:11.0386561Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-12-04T09:17:11.0389817Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-12-04T09:17:11.0392030Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-12-04T09:17:11.0394162Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-12-04T09:17:11.0397496Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-12-04T09:17:11.0399627Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 2025-12-04T09:17:11.0401787Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-12-04T09:17:11.0405354Z * [new branch] gh/yang-yu-hang/1/base -> origin/gh/yang-yu-hang/1/base 2025-12-04T09:17:11.0407699Z * [new branch] gh/yang-yu-hang/1/head -> origin/gh/yang-yu-hang/1/head 2025-12-04T09:17:11.0409973Z * [new branch] gh/yang-yu-hang/1/orig -> origin/gh/yang-yu-hang/1/orig 2025-12-04T09:17:11.0412841Z * [new branch] gh/yang-yu-hang/2/base -> origin/gh/yang-yu-hang/2/base 2025-12-04T09:17:11.0415255Z * [new branch] gh/yang-yu-hang/2/head -> origin/gh/yang-yu-hang/2/head 2025-12-04T09:17:11.0417619Z * [new branch] gh/yang-yu-hang/2/orig -> origin/gh/yang-yu-hang/2/orig 2025-12-04T09:17:11.0420478Z * [new branch] gh/yang-yu-hang/3/base -> origin/gh/yang-yu-hang/3/base 2025-12-04T09:17:11.0422789Z * [new branch] gh/yang-yu-hang/3/head -> origin/gh/yang-yu-hang/3/head 2025-12-04T09:17:11.0425433Z * [new branch] gh/yang-yu-hang/3/orig -> origin/gh/yang-yu-hang/3/orig 2025-12-04T09:17:11.0430568Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-12-04T09:17:11.0433181Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-12-04T09:17:11.0435296Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-12-04T09:17:11.0438186Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-12-04T09:17:11.0440401Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-12-04T09:17:11.0442564Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-12-04T09:17:11.0445355Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-12-04T09:17:11.0447526Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-12-04T09:17:11.0449643Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-12-04T09:17:11.0452556Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-12-04T09:17:11.0454753Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-12-04T09:17:11.0456876Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-12-04T09:17:11.0459737Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-12-04T09:17:11.0462007Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-12-04T09:17:11.0464271Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-12-04T09:17:11.0467188Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-12-04T09:17:11.0469321Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-12-04T09:17:11.0471488Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-12-04T09:17:11.0474266Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-12-04T09:17:11.0476631Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-12-04T09:17:11.0478602Z * [new branch] gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-12-04T09:17:11.0482141Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-12-04T09:17:11.0484201Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-12-04T09:17:11.0486330Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-12-04T09:17:11.0489202Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-12-04T09:17:11.0491337Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-12-04T09:17:11.0493489Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 2025-12-04T09:17:11.0496511Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-12-04T09:17:11.0498671Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-12-04T09:17:11.0500812Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-12-04T09:17:11.0503790Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-12-04T09:17:11.0505936Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-12-04T09:17:11.0508079Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-12-04T09:17:11.0511154Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-12-04T09:17:11.0513304Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-12-04T09:17:11.0515480Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-12-04T09:17:11.0518319Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 2025-12-04T09:17:11.0520414Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-12-04T09:17:11.0522527Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-12-04T09:17:11.0525360Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-12-04T09:17:11.0527857Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-12-04T09:17:11.0529919Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-12-04T09:17:11.0532831Z * [new branch] gh/ydwu4/327/base -> origin/gh/ydwu4/327/base 2025-12-04T09:17:11.0535022Z * [new branch] gh/ydwu4/327/head -> origin/gh/ydwu4/327/head 2025-12-04T09:17:11.0537170Z * [new branch] gh/ydwu4/327/orig -> origin/gh/ydwu4/327/orig 2025-12-04T09:17:11.0540197Z * [new branch] gh/ydwu4/328/base -> origin/gh/ydwu4/328/base 2025-12-04T09:17:11.0542276Z * [new branch] gh/ydwu4/328/head -> origin/gh/ydwu4/328/head 2025-12-04T09:17:11.0544572Z * [new branch] gh/ydwu4/328/orig -> origin/gh/ydwu4/328/orig 2025-12-04T09:17:11.0547769Z * [new branch] gh/ydwu4/329/base -> origin/gh/ydwu4/329/base 2025-12-04T09:17:11.0549866Z * [new branch] gh/ydwu4/329/head -> origin/gh/ydwu4/329/head 2025-12-04T09:17:11.0551992Z * [new branch] gh/ydwu4/329/orig -> origin/gh/ydwu4/329/orig 2025-12-04T09:17:11.0554885Z * [new branch] gh/ydwu4/330/base -> origin/gh/ydwu4/330/base 2025-12-04T09:17:11.0556986Z * [new branch] gh/ydwu4/330/head -> origin/gh/ydwu4/330/head 2025-12-04T09:17:11.0559222Z * [new branch] gh/ydwu4/330/orig -> origin/gh/ydwu4/330/orig 2025-12-04T09:17:11.0561952Z * [new branch] gh/ydwu4/331/base -> origin/gh/ydwu4/331/base 2025-12-04T09:17:11.0564261Z * [new branch] gh/ydwu4/331/head -> origin/gh/ydwu4/331/head 2025-12-04T09:17:11.0566261Z * [new branch] gh/ydwu4/331/orig -> origin/gh/ydwu4/331/orig 2025-12-04T09:17:11.0569060Z * [new branch] gh/ydwu4/332/base -> origin/gh/ydwu4/332/base 2025-12-04T09:17:11.0571204Z * [new branch] gh/ydwu4/332/head -> origin/gh/ydwu4/332/head 2025-12-04T09:17:11.0573314Z * [new branch] gh/ydwu4/332/orig -> origin/gh/ydwu4/332/orig 2025-12-04T09:17:11.0576022Z * [new branch] gh/ydwu4/333/base -> origin/gh/ydwu4/333/base 2025-12-04T09:17:11.0578172Z * [new branch] gh/ydwu4/333/head -> origin/gh/ydwu4/333/head 2025-12-04T09:17:11.0580316Z * [new branch] gh/ydwu4/333/orig -> origin/gh/ydwu4/333/orig 2025-12-04T09:17:11.0583179Z * [new branch] gh/ydwu4/334/base -> origin/gh/ydwu4/334/base 2025-12-04T09:17:11.0585375Z * [new branch] gh/ydwu4/334/head -> origin/gh/ydwu4/334/head 2025-12-04T09:17:11.0587530Z * [new branch] gh/ydwu4/334/orig -> origin/gh/ydwu4/334/orig 2025-12-04T09:17:11.0590245Z * [new branch] gh/ydwu4/335/base -> origin/gh/ydwu4/335/base 2025-12-04T09:17:11.0592415Z * [new branch] gh/ydwu4/335/head -> origin/gh/ydwu4/335/head 2025-12-04T09:17:11.0594623Z * [new branch] gh/ydwu4/335/orig -> origin/gh/ydwu4/335/orig 2025-12-04T09:17:11.0598027Z * [new branch] gh/ydwu4/337/base -> origin/gh/ydwu4/337/base 2025-12-04T09:17:11.0600148Z * [new branch] gh/ydwu4/337/head -> origin/gh/ydwu4/337/head 2025-12-04T09:17:11.0602332Z * [new branch] gh/ydwu4/337/orig -> origin/gh/ydwu4/337/orig 2025-12-04T09:17:11.0605260Z * [new branch] gh/ydwu4/339/base -> origin/gh/ydwu4/339/base 2025-12-04T09:17:11.0607441Z * [new branch] gh/ydwu4/339/head -> origin/gh/ydwu4/339/head 2025-12-04T09:17:11.0609658Z * [new branch] gh/ydwu4/339/orig -> origin/gh/ydwu4/339/orig 2025-12-04T09:17:11.0613090Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-12-04T09:17:11.0615215Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-12-04T09:17:11.0618576Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-12-04T09:17:11.0620690Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-12-04T09:17:11.0625258Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-12-04T09:17:11.0627798Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-12-04T09:17:11.0630067Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-12-04T09:17:11.0632879Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-12-04T09:17:11.0635052Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-12-04T09:17:11.0637222Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-12-04T09:17:11.0640633Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-12-04T09:17:11.0642823Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-12-04T09:17:11.0645543Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-12-04T09:17:11.0647624Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-12-04T09:17:11.0651188Z * [new branch] gh/yushangdi/1/base -> origin/gh/yushangdi/1/base 2025-12-04T09:17:11.0653526Z * [new branch] gh/yushangdi/1/head -> origin/gh/yushangdi/1/head 2025-12-04T09:17:11.0656329Z * [new branch] gh/yushangdi/10/base -> origin/gh/yushangdi/10/base 2025-12-04T09:17:11.0658429Z * [new branch] gh/yushangdi/10/head -> origin/gh/yushangdi/10/head 2025-12-04T09:17:11.0660601Z * [new branch] gh/yushangdi/10/orig -> origin/gh/yushangdi/10/orig 2025-12-04T09:17:11.0663558Z * [new branch] gh/yushangdi/11/base -> origin/gh/yushangdi/11/base 2025-12-04T09:17:11.0665713Z * [new branch] gh/yushangdi/11/head -> origin/gh/yushangdi/11/head 2025-12-04T09:17:11.0667880Z * [new branch] gh/yushangdi/11/orig -> origin/gh/yushangdi/11/orig 2025-12-04T09:17:11.0670538Z * [new branch] gh/yushangdi/2/base -> origin/gh/yushangdi/2/base 2025-12-04T09:17:11.0672698Z * [new branch] gh/yushangdi/2/head -> origin/gh/yushangdi/2/head 2025-12-04T09:17:11.0675605Z * [new branch] gh/yushangdi/7/base -> origin/gh/yushangdi/7/base 2025-12-04T09:17:11.0677726Z * [new branch] gh/yushangdi/7/head -> origin/gh/yushangdi/7/head 2025-12-04T09:17:11.0679978Z * [new branch] gh/yushangdi/7/orig -> origin/gh/yushangdi/7/orig 2025-12-04T09:17:11.0683097Z * [new branch] gh/yushangdi/8/base -> origin/gh/yushangdi/8/base 2025-12-04T09:17:11.0685449Z * [new branch] gh/yushangdi/8/head -> origin/gh/yushangdi/8/head 2025-12-04T09:17:11.0687615Z * [new branch] gh/yushangdi/8/orig -> origin/gh/yushangdi/8/orig 2025-12-04T09:17:11.0690318Z * [new branch] gh/yushangdi/9/base -> origin/gh/yushangdi/9/base 2025-12-04T09:17:11.0692999Z * [new branch] gh/yushangdi/9/head -> origin/gh/yushangdi/9/head 2025-12-04T09:17:11.0695222Z * [new branch] gh/yushangdi/9/orig -> origin/gh/yushangdi/9/orig 2025-12-04T09:17:11.0698618Z * [new branch] gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-12-04T09:17:11.0700761Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-12-04T09:17:11.0702969Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-12-04T09:17:11.0706408Z * [new branch] gh/zklaus/20/base -> origin/gh/zklaus/20/base 2025-12-04T09:17:11.0708632Z * [new branch] gh/zklaus/20/head -> origin/gh/zklaus/20/head 2025-12-04T09:17:11.0710787Z * [new branch] gh/zklaus/20/orig -> origin/gh/zklaus/20/orig 2025-12-04T09:17:11.0713597Z * [new branch] gh/zklaus/21/base -> origin/gh/zklaus/21/base 2025-12-04T09:17:11.0715762Z * [new branch] gh/zklaus/21/head -> origin/gh/zklaus/21/head 2025-12-04T09:17:11.0717873Z * [new branch] gh/zklaus/21/orig -> origin/gh/zklaus/21/orig 2025-12-04T09:17:11.0720670Z * [new branch] gh/zklaus/22/base -> origin/gh/zklaus/22/base 2025-12-04T09:17:11.0722748Z * [new branch] gh/zklaus/22/head -> origin/gh/zklaus/22/head 2025-12-04T09:17:11.0725303Z * [new branch] gh/zklaus/22/orig -> origin/gh/zklaus/22/orig 2025-12-04T09:17:11.0728138Z * [new branch] gh/zklaus/23/base -> origin/gh/zklaus/23/base 2025-12-04T09:17:11.0730243Z * [new branch] gh/zklaus/23/head -> origin/gh/zklaus/23/head 2025-12-04T09:17:11.0732370Z * [new branch] gh/zklaus/23/orig -> origin/gh/zklaus/23/orig 2025-12-04T09:17:11.0735211Z * [new branch] gh/zklaus/24/base -> origin/gh/zklaus/24/base 2025-12-04T09:17:11.0737535Z * [new branch] gh/zklaus/24/head -> origin/gh/zklaus/24/head 2025-12-04T09:17:11.0739681Z * [new branch] gh/zklaus/24/orig -> origin/gh/zklaus/24/orig 2025-12-04T09:17:11.0743568Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-12-04T09:17:11.0745525Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-12-04T09:17:11.0747680Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 2025-12-04T09:17:11.0750884Z * [new branch] gh/zou3519/1199/base -> origin/gh/zou3519/1199/base 2025-12-04T09:17:11.0753241Z * [new branch] gh/zou3519/1199/head -> origin/gh/zou3519/1199/head 2025-12-04T09:17:11.0755414Z * [new branch] gh/zou3519/1199/orig -> origin/gh/zou3519/1199/orig 2025-12-04T09:17:11.0758420Z * [new branch] gh/zou3519/1200/base -> origin/gh/zou3519/1200/base 2025-12-04T09:17:11.0760547Z * [new branch] gh/zou3519/1200/head -> origin/gh/zou3519/1200/head 2025-12-04T09:17:11.0762693Z * [new branch] gh/zou3519/1200/orig -> origin/gh/zou3519/1200/orig 2025-12-04T09:17:11.0765679Z * [new branch] gh/zou3519/1201/base -> origin/gh/zou3519/1201/base 2025-12-04T09:17:11.0767735Z * [new branch] gh/zou3519/1201/head -> origin/gh/zou3519/1201/head 2025-12-04T09:17:11.0769874Z * [new branch] gh/zou3519/1201/orig -> origin/gh/zou3519/1201/orig 2025-12-04T09:17:11.0772552Z * [new branch] gh/zou3519/1202/base -> origin/gh/zou3519/1202/base 2025-12-04T09:17:11.0774711Z * [new branch] gh/zou3519/1202/head -> origin/gh/zou3519/1202/head 2025-12-04T09:17:11.0776890Z * [new branch] gh/zou3519/1202/orig -> origin/gh/zou3519/1202/orig 2025-12-04T09:17:11.0780324Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-12-04T09:17:11.0782444Z * [new branch] gh/zpcore/1/head -> origin/gh/zpcore/1/head 2025-12-04T09:17:11.0785505Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-12-04T09:17:11.0787757Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-12-04T09:17:11.0789870Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-12-04T09:17:11.0793298Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-12-04T09:17:11.0795742Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-12-04T09:17:11.0797729Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-12-04T09:17:11.0800664Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-12-04T09:17:11.0802728Z * [new branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-12-04T09:17:11.0804875Z * [new branch] gh/zpcore/13/orig -> origin/gh/zpcore/13/orig 2025-12-04T09:17:11.0807886Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-12-04T09:17:11.0810053Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-12-04T09:17:11.0812190Z * [new branch] gh/zpcore/14/orig -> origin/gh/zpcore/14/orig 2025-12-04T09:17:11.0815297Z * [new branch] gh/zpcore/15/base -> origin/gh/zpcore/15/base 2025-12-04T09:17:11.0817381Z * [new branch] gh/zpcore/15/head -> origin/gh/zpcore/15/head 2025-12-04T09:17:11.0819554Z * [new branch] gh/zpcore/15/orig -> origin/gh/zpcore/15/orig 2025-12-04T09:17:11.0822580Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-12-04T09:17:11.0825050Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-12-04T09:17:11.0830230Z * [new branch] gh/zpcore/21/base -> origin/gh/zpcore/21/base 2025-12-04T09:17:11.0832628Z * [new branch] gh/zpcore/21/head -> origin/gh/zpcore/21/head 2025-12-04T09:17:11.0834673Z * [new branch] gh/zpcore/21/orig -> origin/gh/zpcore/21/orig 2025-12-04T09:17:11.0837664Z * [new branch] gh/zpcore/22/base -> origin/gh/zpcore/22/base 2025-12-04T09:17:11.0840283Z * [new branch] gh/zpcore/22/head -> origin/gh/zpcore/22/head 2025-12-04T09:17:11.0842495Z * [new branch] gh/zpcore/22/orig -> origin/gh/zpcore/22/orig 2025-12-04T09:17:11.0845448Z * [new branch] gh/zpcore/23/base -> origin/gh/zpcore/23/base 2025-12-04T09:17:11.0847672Z * [new branch] gh/zpcore/23/head -> origin/gh/zpcore/23/head 2025-12-04T09:17:11.0849795Z * [new branch] gh/zpcore/23/orig -> origin/gh/zpcore/23/orig 2025-12-04T09:17:11.0852577Z * [new branch] gh/zpcore/24/base -> origin/gh/zpcore/24/base 2025-12-04T09:17:11.0854887Z * [new branch] gh/zpcore/24/head -> origin/gh/zpcore/24/head 2025-12-04T09:17:11.0857003Z * [new branch] gh/zpcore/24/orig -> origin/gh/zpcore/24/orig 2025-12-04T09:17:11.0860099Z * [new branch] gh/zpcore/25/base -> origin/gh/zpcore/25/base 2025-12-04T09:17:11.0862268Z * [new branch] gh/zpcore/25/head -> origin/gh/zpcore/25/head 2025-12-04T09:17:11.0864640Z * [new branch] gh/zpcore/25/orig -> origin/gh/zpcore/25/orig 2025-12-04T09:17:11.0867578Z * [new branch] gh/zpcore/26/base -> origin/gh/zpcore/26/base 2025-12-04T09:17:11.0869809Z * [new branch] gh/zpcore/26/head -> origin/gh/zpcore/26/head 2025-12-04T09:17:11.0872134Z * [new branch] gh/zpcore/26/orig -> origin/gh/zpcore/26/orig 2025-12-04T09:17:11.0875103Z * [new branch] gh/zpcore/27/base -> origin/gh/zpcore/27/base 2025-12-04T09:17:11.0877217Z * [new branch] gh/zpcore/27/head -> origin/gh/zpcore/27/head 2025-12-04T09:17:11.0879346Z * [new branch] gh/zpcore/27/orig -> origin/gh/zpcore/27/orig 2025-12-04T09:17:11.0882699Z * [new branch] gh/zpcore/28/base -> origin/gh/zpcore/28/base 2025-12-04T09:17:11.0885421Z * [new branch] gh/zpcore/28/head -> origin/gh/zpcore/28/head 2025-12-04T09:17:11.0887567Z * [new branch] gh/zpcore/28/orig -> origin/gh/zpcore/28/orig 2025-12-04T09:17:11.0890267Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-12-04T09:17:11.0892326Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-12-04T09:17:11.0895052Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-12-04T09:17:11.0897211Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-12-04T09:17:11.0899886Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-12-04T09:17:11.0901972Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-12-04T09:17:11.0904875Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-12-04T09:17:11.0906973Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-12-04T09:17:11.0910216Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-12-04T09:17:11.0912401Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-12-04T09:17:11.0915203Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-12-04T09:17:11.0917375Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 2025-12-04T09:17:11.0920015Z * [new branch] google-main -> origin/google-main 2025-12-04T09:17:11.0923090Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-12-04T09:17:11.0925137Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-12-04T09:17:11.0928230Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-12-04T09:17:11.0930749Z * [new branch] hameerabbasi/complex_tensor_subclass -> origin/hameerabbasi/complex_tensor_subclass 2025-12-04T09:17:11.0932760Z * [new branch] hameerabbasi/fix-ctensor-gradcheck-tests -> origin/hameerabbasi/fix-ctensor-gradcheck-tests 2025-12-04T09:17:11.0934489Z * [new branch] hameerabbasi/gradcheck-allclose -> origin/hameerabbasi/gradcheck-allclose 2025-12-04T09:17:11.0936299Z * [new branch] hc_baseline -> origin/hc_baseline 2025-12-04T09:17:11.0938255Z * [new branch] hhh_rand -> origin/hhh_rand 2025-12-04T09:17:11.0940888Z * [new branch] huba/f1 -> origin/huba/f1 2025-12-04T09:17:11.0943572Z * [new branch] increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test -> origin/increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test 2025-12-04T09:17:11.0945257Z * [new branch] inlining -> origin/inlining 2025-12-04T09:17:11.0947191Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-12-04T09:17:11.0949214Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-12-04T09:17:11.0951546Z * [new branch] instrument-trunk-pull-linux-with-job-test-filters -> origin/instrument-trunk-pull-linux-with-job-test-filters 2025-12-04T09:17:11.0953155Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-12-04T09:17:11.0955143Z * [new branch] issue#58739 -> origin/issue#58739 2025-12-04T09:17:11.0957234Z * [new branch] jainapurva-patch-1 -> origin/jainapurva-patch-1 2025-12-04T09:17:11.0960186Z * [new branch] jathu/o3 -> origin/jathu/o3 2025-12-04T09:17:11.0961930Z * [new branch] jathu/sve -> origin/jathu/sve 2025-12-04T09:17:11.0965226Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-12-04T09:17:11.0967020Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-12-04T09:17:11.0969605Z * [new branch] jiannanWang/memorysnapshot_filter -> origin/jiannanWang/memorysnapshot_filter 2025-12-04T09:17:11.0971454Z * [new branch] jiannanWang/profilerstepwarning -> origin/jiannanWang/profilerstepwarning 2025-12-04T09:17:11.0973421Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-12-04T09:17:11.0975461Z * [new branch] jithunnair-amd-patch-10 -> origin/jithunnair-amd-patch-10 2025-12-04T09:17:11.0977450Z * [new branch] jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-12-04T09:17:11.0979429Z * [new branch] jithunnair-amd-patch-3 -> origin/jithunnair-amd-patch-3 2025-12-04T09:17:11.0981403Z * [new branch] jithunnair-amd-patch-4 -> origin/jithunnair-amd-patch-4 2025-12-04T09:17:11.0983538Z * [new branch] jithunnair-amd-patch-5 -> origin/jithunnair-amd-patch-5 2025-12-04T09:17:11.0985561Z * [new branch] jithunnair-amd-patch-6 -> origin/jithunnair-amd-patch-6 2025-12-04T09:17:11.0987524Z * [new branch] jithunnair-amd-patch-7 -> origin/jithunnair-amd-patch-7 2025-12-04T09:17:11.0989514Z * [new branch] jithunnair-amd-patch-8 -> origin/jithunnair-amd-patch-8 2025-12-04T09:17:11.0991717Z * [new branch] jithunnair-amd-patch-9 -> origin/jithunnair-amd-patch-9 2025-12-04T09:17:11.0994224Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-12-04T09:17:11.0996978Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-12-04T09:17:11.0998694Z * [new branch] kainan_test -> origin/kainan_test 2025-12-04T09:17:11.1000677Z * [new branch] larryliu0820-patch-1 -> origin/larryliu0820-patch-1 2025-12-04T09:17:11.1004073Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-12-04T09:17:11.1006603Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-12-04T09:17:11.1009132Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-12-04T09:17:11.1011001Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-12-04T09:17:11.1012768Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-12-04T09:17:11.1015019Z * [new branch] llama4-stable -> origin/llama4-stable 2025-12-04T09:17:11.1018189Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-12-04T09:17:11.1021031Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-12-04T09:17:11.1022786Z * [new branch] lucaskabela/fix_164876 -> origin/lucaskabela/fix_164876 2025-12-04T09:17:11.1025013Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-12-04T09:17:11.1026674Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-12-04T09:17:11.1028337Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-12-04T09:17:11.1030145Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-12-04T09:17:11.1032667Z * [new branch] lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-12-04T09:17:11.1034946Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-12-04T09:17:11.1036668Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-12-04T09:17:11.1038506Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-12-04T09:17:11.1040355Z * [new branch] lucaskabela/typing_ctx_manager -> origin/lucaskabela/typing_ctx_manager 2025-12-04T09:17:11.1042204Z * [new branch] lucaskabela/typing_nn_module -> origin/lucaskabela/typing_nn_module 2025-12-04T09:17:11.1044009Z * [new branch] lucaskabela/typing_user_defined -> origin/lucaskabela/typing_user_defined 2025-12-04T09:17:11.1045863Z * [new branch] lucaskabela/typing_variables -> origin/lucaskabela/typing_variables 2025-12-04T09:17:11.1047696Z * [new branch] lucaskabela/typing_variables_dicts -> origin/lucaskabela/typing_variables_dicts 2025-12-04T09:17:11.1049642Z * [new branch] lucaskabela/typing_variables_functions -> origin/lucaskabela/typing_variables_functions 2025-12-04T09:17:11.1051524Z * [new branch] lucaskabela/typing_variables_lists -> origin/lucaskabela/typing_variables_lists 2025-12-04T09:17:11.1054043Z * [new branch] lw/torch_box_by_ref -> origin/lw/torch_box_by_ref 2025-12-04T09:17:11.1055999Z * [new branch] main -> origin/main 2025-12-04T09:17:11.1058092Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-12-04T09:17:11.1060019Z * [new branch] malfet-patch-2 -> origin/malfet-patch-2 2025-12-04T09:17:11.1062370Z * [new branch] malfet-patch-3 -> origin/malfet-patch-3 2025-12-04T09:17:11.1064491Z * [new branch] malfet-patch-4 -> origin/malfet-patch-4 2025-12-04T09:17:11.1066311Z * [new branch] malfet-patch-5 -> origin/malfet-patch-5 2025-12-04T09:17:11.1068203Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-12-04T09:17:11.1070199Z * [new branch] malfet-patch-7 -> origin/malfet-patch-7 2025-12-04T09:17:11.1072223Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-12-04T09:17:11.1074697Z * [new branch] malfet/add-3.14-ci -> origin/malfet/add-3.14-ci 2025-12-04T09:17:11.1076677Z * [new branch] malfet/be-do-not-make-typos-in-build-artifacts -> origin/malfet/be-do-not-make-typos-in-build-artifacts 2025-12-04T09:17:11.1078508Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-12-04T09:17:11.1080545Z * [new branch] malfet/be-remove-misisng-neon-headers -> origin/malfet/be-remove-misisng-neon-headers 2025-12-04T09:17:11.1082592Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-12-04T09:17:11.1085188Z * [new branch] manuel/aoti_metal_shimify-thread_safe -> origin/manuel/aoti_metal_shimify-thread_safe 2025-12-04T09:17:11.1086841Z * [new branch] manuel/inductor_link_openmp -> origin/manuel/inductor_link_openmp 2025-12-04T09:17:11.1089323Z * [new branch] masnesral/metaconda -> origin/masnesral/metaconda 2025-12-04T09:17:11.1091327Z * [new branch] mem_profiler_flaky_fix -> origin/mem_profiler_flaky_fix 2025-12-04T09:17:11.1093219Z * [new branch] mem_profiler_stack_trace -> origin/mem_profiler_stack_trace 2025-12-04T09:17:11.1095176Z * [new branch] memory_profiler_stack -> origin/memory_profiler_stack 2025-12-04T09:17:11.1097108Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-12-04T09:17:11.1099014Z * [new branch] mingw_posix -> origin/mingw_posix 2025-12-04T09:17:11.1101601Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-12-04T09:17:11.1103445Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-12-04T09:17:11.1105553Z * [new branch] mlazos/acts -> origin/mlazos/acts 2025-12-04T09:17:11.1107403Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-12-04T09:17:11.1109148Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-12-04T09:17:11.1110970Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-12-04T09:17:11.1112695Z * [new branch] mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-12-04T09:17:11.1114352Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-12-04T09:17:11.1115944Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-12-04T09:17:11.1117990Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-12-04T09:17:11.1120118Z * [new branch] mlazos/bwd -> origin/mlazos/bwd 2025-12-04T09:17:11.1121915Z * [new branch] mlazos/combo-test -> origin/mlazos/combo-test 2025-12-04T09:17:11.1124029Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-12-04T09:17:11.1125935Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-12-04T09:17:11.1127895Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-12-04T09:17:11.1129813Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-12-04T09:17:11.1131616Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-12-04T09:17:11.1133641Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-12-04T09:17:11.1135365Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-12-04T09:17:11.1137256Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-12-04T09:17:11.1139133Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-12-04T09:17:11.1140962Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-12-04T09:17:11.1142816Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-12-04T09:17:11.1144814Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-12-04T09:17:11.1146639Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-12-04T09:17:11.1148467Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-12-04T09:17:11.1150373Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-12-04T09:17:11.1152101Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-12-04T09:17:11.1153924Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-12-04T09:17:11.1155739Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-12-04T09:17:11.1157636Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-12-04T09:17:11.1159519Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-12-04T09:17:11.1161413Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-12-04T09:17:11.1163879Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-12-04T09:17:11.1165829Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-12-04T09:17:11.1167823Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-12-04T09:17:11.1169684Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-12-04T09:17:11.1171597Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-12-04T09:17:11.1173466Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-12-04T09:17:11.1175252Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-12-04T09:17:11.1177287Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-12-04T09:17:11.1179599Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-12-04T09:17:11.1181437Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-12-04T09:17:11.1183343Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-12-04T09:17:11.1185205Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-12-04T09:17:11.1187060Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-12-04T09:17:11.1188885Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 2025-12-04T09:17:11.1190725Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-12-04T09:17:11.1192530Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-12-04T09:17:11.1194375Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-12-04T09:17:11.1196177Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-12-04T09:17:11.1198095Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-12-04T09:17:11.1199911Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-12-04T09:17:11.1201859Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-12-04T09:17:11.1203712Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-12-04T09:17:11.1205455Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-12-04T09:17:11.1207397Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-12-04T09:17:11.1209193Z * [new branch] mlazos/inductor-streams -> origin/mlazos/inductor-streams 2025-12-04T09:17:11.1210811Z * [new branch] mlazos/main -> origin/mlazos/main 2025-12-04T09:17:11.1212701Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-12-04T09:17:11.1214617Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-12-04T09:17:11.1217301Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-12-04T09:17:11.1219061Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-12-04T09:17:11.1220830Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-12-04T09:17:11.1222792Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-12-04T09:17:11.1224940Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-12-04T09:17:11.1228721Z * [new branch] mlazos/overguarding -> origin/mlazos/overguarding 2025-12-04T09:17:11.1230578Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-12-04T09:17:11.1232376Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-12-04T09:17:11.1234240Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-12-04T09:17:11.1236149Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-12-04T09:17:11.1238039Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-12-04T09:17:11.1239834Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-12-04T09:17:11.1241786Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-12-04T09:17:11.1243721Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-12-04T09:17:11.1245559Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-12-04T09:17:11.1247218Z * [new branch] mlazos/stests -> origin/mlazos/stests 2025-12-04T09:17:11.1249182Z * [new branch] mlazos/stream-ops -> origin/mlazos/stream-ops 2025-12-04T09:17:11.1251003Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-12-04T09:17:11.1252882Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-12-04T09:17:11.1254639Z * [new branch] mlazos/test -> origin/mlazos/test 2025-12-04T09:17:11.1256491Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-12-04T09:17:11.1258463Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-12-04T09:17:11.1260321Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-12-04T09:17:11.1262277Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-12-04T09:17:11.1264318Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-12-04T09:17:11.1266165Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 2025-12-04T09:17:11.1268093Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-12-04T09:17:11.1269959Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-12-04T09:17:11.1271981Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-12-04T09:17:11.1273698Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-12-04T09:17:11.1275549Z * [new branch] mlazos/user-stream-base -> origin/mlazos/user-stream-base 2025-12-04T09:17:11.1277418Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-12-04T09:17:11.1279304Z * [new branch] mlazos/user-streams-backup -> origin/mlazos/user-streams-backup 2025-12-04T09:17:11.1281175Z * [new branch] mlazos/user-streams-backup2 -> origin/mlazos/user-streams-backup2 2025-12-04T09:17:11.1283011Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-12-04T09:17:11.1284914Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-12-04T09:17:11.1286764Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-12-04T09:17:11.1289223Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-12-04T09:17:11.1291108Z * [new branch] module-shim -> origin/module-shim 2025-12-04T09:17:11.1293120Z * [new branch] move_config -> origin/move_config 2025-12-04T09:17:11.1295641Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-12-04T09:17:11.1298125Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-12-04T09:17:11.1300694Z * [new branch] mwizak/fix-triton-block-shape -> origin/mwizak/fix-triton-block-shape 2025-12-04T09:17:11.1302605Z * [new branch] my_varlen_backup -> origin/my_varlen_backup 2025-12-04T09:17:11.1304614Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-12-04T09:17:11.1306477Z * [new branch] new-codegen -> origin/new-codegen 2025-12-04T09:17:11.1308396Z * [new branch] newtest-base -> origin/newtest-base 2025-12-04T09:17:11.1310931Z * [new branch] ngimel/addmm_dtype -> origin/ngimel/addmm_dtype 2025-12-04T09:17:11.1312654Z * [new branch] ngimel/div_inv -> origin/ngimel/div_inv 2025-12-04T09:17:11.1314562Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-12-04T09:17:11.1316279Z * [new branch] ngimel/gather_grid -> origin/ngimel/gather_grid 2025-12-04T09:17:11.1318018Z * [new branch] ngimel/gather_grid_release -> origin/ngimel/gather_grid_release 2025-12-04T09:17:11.1319655Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-12-04T09:17:11.1321376Z * [new branch] ngimel/hostalloc -> origin/ngimel/hostalloc 2025-12-04T09:17:11.1323161Z * [new branch] ngimel/storage_id -> origin/ngimel/storage_id 2025-12-04T09:17:11.1325553Z * [new branch] nightly -> origin/nightly 2025-12-04T09:17:11.1328073Z * [new branch] nikitaved/addmm_1_rowcol_lt_path_check -> origin/nikitaved/addmm_1_rowcol_lt_path_check 2025-12-04T09:17:11.1329763Z * [new branch] nikitaved/addmm_epilogue_fusions_2d_bias -> origin/nikitaved/addmm_epilogue_fusions_2d_bias 2025-12-04T09:17:11.1331514Z * [new branch] nikitaved/addmm_epilogue_fusions_inductor -> origin/nikitaved/addmm_epilogue_fusions_inductor 2025-12-04T09:17:11.1333609Z * [new branch] nikitaved/addmm_epilogue_fusions_scratch -> origin/nikitaved/addmm_epilogue_fusions_scratch 2025-12-04T09:17:11.1335662Z * [new branch] nikitaved/grad_addmm_epilogue_fusions -> origin/nikitaved/grad_addmm_epilogue_fusions 2025-12-04T09:17:11.1337791Z * [new branch] nikitaved/simpler_can_use_32bit_index -> origin/nikitaved/simpler_can_use_32bit_index 2025-12-04T09:17:11.1339820Z * [new branch] nikitaved/test -> origin/nikitaved/test 2025-12-04T09:17:11.1341916Z * [new branch] nmacchioni-perf-test-async-autotune -> origin/nmacchioni-perf-test-async-autotune 2025-12-04T09:17:11.1343924Z * [new branch] no_distributed_log_spew -> origin/no_distributed_log_spew 2025-12-04T09:17:11.1345846Z * [new branch] nofun-hack -> origin/nofun-hack 2025-12-04T09:17:11.1347776Z * [new branch] norm_bench -> origin/norm_bench 2025-12-04T09:17:11.1350816Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-12-04T09:17:11.1352804Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-12-04T09:17:11.1354750Z * [new branch] optimizer_test -> origin/optimizer_test 2025-12-04T09:17:11.1357898Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-12-04T09:17:11.1359842Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-12-04T09:17:11.1361712Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-12-04T09:17:11.1363668Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-12-04T09:17:11.1365576Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-12-04T09:17:11.1367656Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-12-04T09:17:11.1369550Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-12-04T09:17:11.1371539Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-12-04T09:17:11.1373618Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-12-04T09:17:11.1375478Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-12-04T09:17:11.1377183Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-12-04T09:17:11.1378952Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-12-04T09:17:11.1380753Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-12-04T09:17:11.1382651Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-12-04T09:17:11.1384527Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-12-04T09:17:11.1386797Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-12-04T09:17:11.1389173Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-12-04T09:17:11.1391050Z * [new branch] orig/release/2.9 -> origin/orig/release/2.9 2025-12-04T09:17:11.1395173Z * [new branch] origin/gh/fxdawnn/1/base -> origin/origin/gh/fxdawnn/1/base 2025-12-04T09:17:11.1397073Z * [new branch] origin/gh/fxdawnn/1/orig -> origin/origin/gh/fxdawnn/1/orig 2025-12-04T09:17:11.1400094Z * [new branch] origin/gh/zpcore/14/orig -> origin/origin/gh/zpcore/14/orig 2025-12-04T09:17:11.1402242Z * [new branch] oulgen-patch-1 -> origin/oulgen-patch-1 2025-12-04T09:17:11.1404208Z * [new branch] oulgen-patch-2 -> origin/oulgen-patch-2 2025-12-04T09:17:11.1406192Z * [new branch] oulgen-patch-3 -> origin/oulgen-patch-3 2025-12-04T09:17:11.1408514Z * [new branch] oulgen-patch-4 -> origin/oulgen-patch-4 2025-12-04T09:17:11.1410461Z * [new branch] padded-tensor -> origin/padded-tensor 2025-12-04T09:17:11.1412468Z * [new branch] pca2 -> origin/pca2 2025-12-04T09:17:11.1414590Z * [new branch] per_channel_backup -> origin/per_channel_backup 2025-12-04T09:17:11.1416654Z * [new branch] perf_ops -> origin/perf_ops 2025-12-04T09:17:11.1418480Z * [new branch] perf_ops_2_9 -> origin/perf_ops_2_9 2025-12-04T09:17:11.1420545Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-12-04T09:17:11.1423154Z * [new branch] pianpwk/__draft_debug_mode -> origin/pianpwk/__draft_debug_mode 2025-12-04T09:17:11.1425331Z * [new branch] pianpwk/_debug_mode_for_triton_draft -> origin/pianpwk/_debug_mode_for_triton_draft 2025-12-04T09:17:11.1427053Z * [new branch] pianpwk/_debug_nn_module_compile -> origin/pianpwk/_debug_nn_module_compile 2025-12-04T09:17:11.1428669Z * [new branch] pianpwk/_draft_triton_11_3 -> origin/pianpwk/_draft_triton_11_3 2025-12-04T09:17:11.1430396Z * [new branch] pianpwk/_manual_bucket_draft -> origin/pianpwk/_manual_bucket_draft 2025-12-04T09:17:11.1432554Z * [new branch] pianpwk/_profile_w_dispatch_keys -> origin/pianpwk/_profile_w_dispatch_keys 2025-12-04T09:17:11.1434721Z * [new branch] pianpwk/_super_draft_debug_mode -> origin/pianpwk/_super_draft_debug_mode 2025-12-04T09:17:11.1436709Z * [new branch] pianpwk/_unbacked_local_shard_size -> origin/pianpwk/_unbacked_local_shard_size 2025-12-04T09:17:11.1438464Z * [new branch] pianpwk/anomaly_tb -> origin/pianpwk/anomaly_tb 2025-12-04T09:17:11.1440343Z * [new branch] pianpwk/auto_fx_annotate -> origin/pianpwk/auto_fx_annotate 2025-12-04T09:17:11.1442209Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-12-04T09:17:11.1444407Z * [new branch] pianpwk/bert_dynamic_perf -> origin/pianpwk/bert_dynamic_perf 2025-12-04T09:17:11.1446376Z * [new branch] pianpwk/debug_fwd_stack_traces -> origin/pianpwk/debug_fwd_stack_traces 2025-12-04T09:17:11.1448333Z * [new branch] pianpwk/debug_hash_tensor -> origin/pianpwk/debug_hash_tensor 2025-12-04T09:17:11.1450183Z * [new branch] pianpwk/debug_mode_annotate -> origin/pianpwk/debug_mode_annotate 2025-12-04T09:17:11.1451984Z * [new branch] pianpwk/debug_mode_defaults -> origin/pianpwk/debug_mode_defaults 2025-12-04T09:17:11.1453792Z * [new branch] pianpwk/debug_mode_hacks -> origin/pianpwk/debug_mode_hacks 2025-12-04T09:17:11.1455785Z * [new branch] pianpwk/debug_mode_opcall_refactor -> origin/pianpwk/debug_mode_opcall_refactor 2025-12-04T09:17:11.1457590Z * [new branch] pianpwk/debug_mode_show_ids -> origin/pianpwk/debug_mode_show_ids 2025-12-04T09:17:11.1459484Z * [new branch] pianpwk/debug_mode_triton -> origin/pianpwk/debug_mode_triton 2025-12-04T09:17:11.1461456Z * [new branch] pianpwk/debug_show_stack_trace -> origin/pianpwk/debug_show_stack_trace 2025-12-04T09:17:11.1470966Z * [new branch] pianpwk/debug_wait_on_collective -> origin/pianpwk/debug_wait_on_collective 2025-12-04T09:17:11.1471355Z * [new branch] pianpwk/debugmode_compile_tf -> origin/pianpwk/debugmode_compile_tf 2025-12-04T09:17:11.1471734Z * [new branch] pianpwk/dispatch_key_debugging_for_debug -> origin/pianpwk/dispatch_key_debugging_for_debug 2025-12-04T09:17:11.1472047Z * [new branch] pianpwk/draft_debug_mode_tfcompile -> origin/pianpwk/draft_debug_mode_tfcompile 2025-12-04T09:17:11.1472314Z * [new branch] pianpwk/draft_multikernel_nn -> origin/pianpwk/draft_multikernel_nn 2025-12-04T09:17:11.1473293Z * [new branch] pianpwk/draft_multikernel_status_10_5 -> origin/pianpwk/draft_multikernel_status_10_5 2025-12-04T09:17:11.1475400Z * [new branch] pianpwk/dtensor_custom_chunk -> origin/pianpwk/dtensor_custom_chunk 2025-12-04T09:17:11.1477362Z * [new branch] pianpwk/dtensor_unbacked_keypath -> origin/pianpwk/dtensor_unbacked_keypath 2025-12-04T09:17:11.1479382Z * [new branch] pianpwk/event_list_tree -> origin/pianpwk/event_list_tree 2025-12-04T09:17:11.1481201Z * [new branch] pianpwk/false_numel_refs -> origin/pianpwk/false_numel_refs 2025-12-04T09:17:11.1482978Z * [new branch] pianpwk/maybe_guard_rel -> origin/pianpwk/maybe_guard_rel 2025-12-04T09:17:11.1484890Z * [new branch] pianpwk/multikernel_hints_draft -> origin/pianpwk/multikernel_hints_draft 2025-12-04T09:17:11.1486911Z * [new branch] pianpwk/no_size_oblivious_slice_scat -> origin/pianpwk/no_size_oblivious_slice_scat 2025-12-04T09:17:11.1488755Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-12-04T09:17:11.1490575Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-12-04T09:17:11.1492579Z * [new branch] pianpwk/skip_python_keys_alternate -> origin/pianpwk/skip_python_keys_alternate 2025-12-04T09:17:11.1494364Z * [new branch] pianpwk/skip_python_keys_in_guards -> origin/pianpwk/skip_python_keys_in_guards 2025-12-04T09:17:11.1496099Z * [new branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-12-04T09:17:11.1497953Z * [new branch] pianpwk/symint_one_hot -> origin/pianpwk/symint_one_hot 2025-12-04T09:17:11.1499991Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-12-04T09:17:11.1501755Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-12-04T09:17:11.1503737Z * [new branch] pianpwk/try_dumb_stuff -> origin/pianpwk/try_dumb_stuff 2025-12-04T09:17:11.1505989Z * [new branch] pianpwk/try_dumb_stuff_2 -> origin/pianpwk/try_dumb_stuff_2 2025-12-04T09:17:11.1507496Z * [new branch] pianpwk/unbacked_dtensor_mm -> origin/pianpwk/unbacked_dtensor_mm 2025-12-04T09:17:11.1509266Z * [new branch] pianpwk/unbacked_tracing_12_2 -> origin/pianpwk/unbacked_tracing_12_2 2025-12-04T09:17:11.1511029Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-12-04T09:17:11.1512858Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-12-04T09:17:11.1515561Z * [new branch] piz/fix_partial_backward_1112 -> origin/piz/fix_partial_backward_1112 2025-12-04T09:17:11.1517325Z * [new branch] piz/prop_cache_clean -> origin/piz/prop_cache_clean 2025-12-04T09:17:11.1519133Z * [new branch] pool-separate -> origin/pool-separate 2025-12-04T09:17:11.1521064Z * [new branch] pr-156087 -> origin/pr-156087 2025-12-04T09:17:11.1523753Z * [new branch] pr/131860 -> origin/pr/131860 2025-12-04T09:17:11.1526053Z * [new branch] predispatch_to -> origin/predispatch_to 2025-12-04T09:17:11.1528030Z * [new branch] protect-c17 -> origin/protect-c17 2025-12-04T09:17:11.1530035Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-12-04T09:17:11.1532477Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-12-04T09:17:11.1535940Z * [new branch] q1l1/fix_device_moved_constant_type_unknown -> origin/q1l1/fix_device_moved_constant_type_unknown 2025-12-04T09:17:11.1537422Z * [new branch] q1l1/fix_wrong_default_type_for_kernel_call_args -> origin/q1l1/fix_wrong_default_type_for_kernel_call_args 2025-12-04T09:17:11.1540269Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-12-04T09:17:11.1542267Z * [new branch] quote-pytest_cache -> origin/quote-pytest_cache 2025-12-04T09:17:11.1544860Z * [new branch] reland-accgrad-stream-warn -> origin/reland-accgrad-stream-warn 2025-12-04T09:17:11.1547643Z * [new branch] release/1.10 -> origin/release/1.10 2025-12-04T09:17:11.1549298Z * [new branch] release/1.11 -> origin/release/1.11 2025-12-04T09:17:11.1551199Z * [new branch] release/1.12 -> origin/release/1.12 2025-12-04T09:17:11.1553011Z * [new branch] release/1.13 -> origin/release/1.13 2025-12-04T09:17:11.1554912Z * [new branch] release/1.4 -> origin/release/1.4 2025-12-04T09:17:11.1556351Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-12-04T09:17:11.1558080Z * [new branch] release/1.5 -> origin/release/1.5 2025-12-04T09:17:11.1559962Z * [new branch] release/1.6 -> origin/release/1.6 2025-12-04T09:17:11.1561875Z * [new branch] release/1.7 -> origin/release/1.7 2025-12-04T09:17:11.1564105Z * [new branch] release/1.8 -> origin/release/1.8 2025-12-04T09:17:11.1565650Z * [new branch] release/1.9 -> origin/release/1.9 2025-12-04T09:17:11.1567508Z * [new branch] release/2.0 -> origin/release/2.0 2025-12-04T09:17:11.1569501Z * [new branch] release/2.1 -> origin/release/2.1 2025-12-04T09:17:11.1571349Z * [new branch] release/2.2 -> origin/release/2.2 2025-12-04T09:17:11.1573525Z * [new branch] release/2.3 -> origin/release/2.3 2025-12-04T09:17:11.1576045Z * [new branch] release/2.4 -> origin/release/2.4 2025-12-04T09:17:11.1578438Z * [new branch] release/2.5 -> origin/release/2.5 2025-12-04T09:17:11.1580681Z * [new branch] release/2.6 -> origin/release/2.6 2025-12-04T09:17:11.1582374Z * [new branch] release/2.7 -> origin/release/2.7 2025-12-04T09:17:11.1584575Z * [new branch] release/2.8 -> origin/release/2.8 2025-12-04T09:17:11.1586591Z * [new branch] release/2.9 -> origin/release/2.9 2025-12-04T09:17:11.1588709Z * [new branch] release_notes -> origin/release_notes 2025-12-04T09:17:11.1590384Z * [new branch] remove_pyinterpreter -> origin/remove_pyinterpreter 2025-12-04T09:17:11.1592570Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-12-04T09:17:11.1594407Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-12-04T09:17:11.1596301Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-12-04T09:17:11.1598064Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-12-04T09:17:11.1601836Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-12-04T09:17:11.1605507Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-12-04T09:17:11.1609187Z * [new branch] revert-152361-gh/fadara01/1/head -> origin/revert-152361-gh/fadara01/1/head 2025-12-04T09:17:11.1612752Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-12-04T09:17:11.1614979Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-12-04T09:17:11.1616678Z * [new branch] revert-hoo-invoke-subgraph -> origin/revert-hoo-invoke-subgraph 2025-12-04T09:17:11.1618672Z * [new branch] revert_always_build_distributed -> origin/revert_always_build_distributed 2025-12-04T09:17:11.1620631Z * [new branch] rms_norm_patch -> origin/rms_norm_patch 2025-12-04T09:17:11.1623380Z * [new branch] ruisi/fix_all_to_all_estimation -> origin/ruisi/fix_all_to_all_estimation 2025-12-04T09:17:11.1625319Z * [new branch] ruisi/fix_comm_estimation -> origin/ruisi/fix_comm_estimation 2025-12-04T09:17:11.1629090Z * [new branch] ruisi/fix_dynamic_shape_estimation -> origin/ruisi/fix_dynamic_shape_estimation 2025-12-04T09:17:11.1631220Z * [new branch] ruisi/fix_llama3_autobucketing -> origin/ruisi/fix_llama3_autobucketing 2025-12-04T09:17:11.1633273Z * [new branch] ruisi/fix_manual_bucketing_ep_pass -> origin/ruisi/fix_manual_bucketing_ep_pass 2025-12-04T09:17:11.1635475Z * [new branch] ruisi/manual_bucket_pass -> origin/ruisi/manual_bucket_pass 2025-12-04T09:17:11.1638291Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-12-04T09:17:11.1639819Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-12-04T09:17:11.1642363Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-12-04T09:17:11.1644238Z * [new branch] rzou/njt -> origin/rzou/njt 2025-12-04T09:17:11.1645790Z * [new branch] rzou/pca -> origin/rzou/pca 2025-12-04T09:17:11.1647805Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-12-04T09:17:11.1649723Z * [new branch] samplevllm -> origin/samplevllm 2025-12-04T09:17:11.1652645Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-12-04T09:17:11.1654704Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-12-04T09:17:11.1657068Z * [new branch] sapling-pr-archive-tushar00jain -> origin/sapling-pr-archive-tushar00jain 2025-12-04T09:17:11.1658878Z * [new branch] save -> origin/save 2025-12-04T09:17:11.1660934Z * [new branch] scaled_mm -> origin/scaled_mm 2025-12-04T09:17:11.1663010Z * [new branch] scan_attempt -> origin/scan_attempt 2025-12-04T09:17:11.1665388Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-12-04T09:17:11.1667273Z * [new branch] sekyondaMeta-dynamoconfig-fix -> origin/sekyondaMeta-dynamoconfig-fix 2025-12-04T09:17:11.1669860Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-12-04T09:17:11.1671853Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 2025-12-04T09:17:11.1673760Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-12-04T09:17:11.1675592Z * [new branch] some_rocm_inductor_skips -> origin/some_rocm_inductor_skips 2025-12-04T09:17:11.1678408Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-12-04T09:17:11.1680300Z * [new branch] sparse-mm-bf16-support -> origin/sparse-mm-bf16-support 2025-12-04T09:17:11.1682413Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-12-04T09:17:11.1684075Z * [new branch] suo -> origin/suo 2025-12-04T09:17:11.1685991Z * [new branch] sve-poc -> origin/sve-poc 2025-12-04T09:17:11.1688017Z * [new branch] switch-bn -> origin/switch-bn 2025-12-04T09:17:11.1690026Z * [new branch] sy_annotation_in_autograd_hop -> origin/sy_annotation_in_autograd_hop 2025-12-04T09:17:11.1691961Z * [new branch] sy_aot_eager_record -> origin/sy_aot_eager_record 2025-12-04T09:17:11.1693882Z * [new branch] sy_custom_bucketing -> origin/sy_custom_bucketing 2025-12-04T09:17:11.1695945Z * [new branch] sy_debug_mode_test -> origin/sy_debug_mode_test 2025-12-04T09:17:11.1698281Z * [new branch] sy_deserialize -> origin/sy_deserialize 2025-12-04T09:17:11.1700140Z * [new branch] sy_dump_gm_code -> origin/sy_dump_gm_code 2025-12-04T09:17:11.1702087Z * [new branch] sy_exp -> origin/sy_exp 2025-12-04T09:17:11.1704175Z * [new branch] sy_export_annotation -> origin/sy_export_annotation 2025-12-04T09:17:11.1706264Z * [new branch] sy_invoke_subgraph -> origin/sy_invoke_subgraph 2025-12-04T09:17:11.1708197Z * [new branch] sy_kernel_bw_name -> origin/sy_kernel_bw_name 2025-12-04T09:17:11.1710082Z * [new branch] sy_multi_arch -> origin/sy_multi_arch 2025-12-04T09:17:11.1712044Z * [new branch] sy_nn_module_stack -> origin/sy_nn_module_stack 2025-12-04T09:17:11.1714219Z * [new branch] sy_original_dtensor -> origin/sy_original_dtensor 2025-12-04T09:17:11.1715926Z * [new branch] sy_profiler_cia -> origin/sy_profiler_cia 2025-12-04T09:17:11.1717917Z * [new branch] symm_mem_sync -> origin/symm_mem_sync 2025-12-04T09:17:11.1719935Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-12-04T09:17:11.1721828Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-12-04T09:17:11.1723920Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-12-04T09:17:11.1726388Z * [new branch] test-old -> origin/test-old 2025-12-04T09:17:11.1729236Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-12-04T09:17:11.1731758Z * [new branch] tianren/customOp_autotune_fix -> origin/tianren/customOp_autotune_fix 2025-12-04T09:17:11.1732696Z * [new branch] tianren/customOp_enable_max_autotune -> origin/tianren/customOp_enable_max_autotune 2025-12-04T09:17:11.1734675Z * [new branch] tianren/customOp_fusion -> origin/tianren/customOp_fusion 2025-12-04T09:17:11.1736596Z * [new branch] tianren/customop_collectiveop_benchmark -> origin/tianren/customop_collectiveop_benchmark 2025-12-04T09:17:11.1738665Z * [new branch] tianren/customop_collectiveop_benchmark_fix -> origin/tianren/customop_collectiveop_benchmark_fix 2025-12-04T09:17:11.1740872Z * [new branch] tianren/customop_dynamic_config -> origin/tianren/customop_dynamic_config 2025-12-04T09:17:11.1742806Z * [new branch] tianren/dynamic_range_input -> origin/tianren/dynamic_range_input 2025-12-04T09:17:11.1744768Z * [new branch] tianren/dynamic_range_input_fix -> origin/tianren/dynamic_range_input_fix 2025-12-04T09:17:11.1746554Z * [new branch] tianren/dynamic_range_input_merge -> origin/tianren/dynamic_range_input_merge 2025-12-04T09:17:11.1748314Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-12-04T09:17:11.1750244Z * [new branch] tianren/fx_codegen_dump -> origin/tianren/fx_codegen_dump 2025-12-04T09:17:11.1752541Z * [new branch] tianren/symmetric_memory -> origin/tianren/symmetric_memory 2025-12-04T09:17:11.1754424Z * [new branch] tianren/test -> origin/tianren/test 2025-12-04T09:17:11.1756340Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-12-04T09:17:11.1758254Z * [new branch] tmp -> origin/tmp 2025-12-04T09:17:11.1760243Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-12-04T09:17:11.1762175Z * [new branch] torchtitan_integration -> origin/torchtitan_integration 2025-12-04T09:17:11.1764313Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-12-04T09:17:11.1766062Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-12-04T09:17:11.1768073Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-12-04T09:17:11.1770041Z * [new branch] triton_kernel -> origin/triton_kernel 2025-12-04T09:17:11.1772030Z * [new branch] tt_pkg_1908 -> origin/tt_pkg_1908 2025-12-04T09:17:11.1773942Z * [new branch] type_dec -> origin/type_dec 2025-12-04T09:17:11.1775907Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-12-04T09:17:11.1778551Z * [new branch] update-audio-commit-hash/17630256502-1803-1 -> origin/update-audio-commit-hash/17630256502-1803-1 2025-12-04T09:17:11.1780385Z * [new branch] update-audio-commit-hash/19087141161-1916-1 -> origin/update-audio-commit-hash/19087141161-1916-1 2025-12-04T09:17:11.1782196Z * [new branch] update-audio-commit-hash/19250643381-1929-1 -> origin/update-audio-commit-hash/19250643381-1929-1 2025-12-04T09:17:11.1784019Z * [new branch] update-audio-commit-hash/19397724337-1935-1 -> origin/update-audio-commit-hash/19397724337-1935-1 2025-12-04T09:17:11.1785808Z * [new branch] update-audio-commit-hash/19555670148-1941-1 -> origin/update-audio-commit-hash/19555670148-1941-1 2025-12-04T09:17:11.1787828Z * [new branch] update-audio-commit-hash/19750627930-1946-1 -> origin/update-audio-commit-hash/19750627930-1946-1 2025-12-04T09:17:11.1790420Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-12-04T09:17:11.1792860Z * [new branch] update-vision-commit-hash/19087141161-1916-1 -> origin/update-vision-commit-hash/19087141161-1916-1 2025-12-04T09:17:11.1794739Z * [new branch] update-vision-commit-hash/19184897099-1925-1 -> origin/update-vision-commit-hash/19184897099-1925-1 2025-12-04T09:17:11.1796461Z * [new branch] update-vision-commit-hash/19250643381-1929-1 -> origin/update-vision-commit-hash/19250643381-1929-1 2025-12-04T09:17:11.1798278Z * [new branch] update-vision-commit-hash/19381328640-1934-1 -> origin/update-vision-commit-hash/19381328640-1934-1 2025-12-04T09:17:11.1799999Z * [new branch] update-vision-commit-hash/19485237164-1938-1 -> origin/update-vision-commit-hash/19485237164-1938-1 2025-12-04T09:17:11.1802552Z * [new branch] update-vllm-commit-hash/18451675449-1879-1 -> origin/update-vllm-commit-hash/18451675449-1879-1 2025-12-04T09:17:11.1804470Z * [new branch] update-vllm-dockerfile -> origin/update-vllm-dockerfile 2025-12-04T09:17:11.1807117Z * [new branch] update-xla-commit-hash/19224287370-211-1 -> origin/update-xla-commit-hash/19224287370-211-1 2025-12-04T09:17:11.1808871Z * [new branch] update-xla-commit-hash/19422028566-212-1 -> origin/update-xla-commit-hash/19422028566-212-1 2025-12-04T09:17:11.1810673Z * [new branch] update-xla-commit-hash/19626841311-213-1 -> origin/update-xla-commit-hash/19626841311-213-1 2025-12-04T09:17:11.1812662Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-12-04T09:17:11.1814413Z * [new branch] update_operator_readme -> origin/update_operator_readme 2025-12-04T09:17:11.1816425Z * [new branch] update_slow_tests_1722488736 -> origin/update_slow_tests_1722488736 2025-12-04T09:17:11.1818398Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-12-04T09:17:11.1820327Z * [new branch] update_slow_tests_1762155677 -> origin/update_slow_tests_1762155677 2025-12-04T09:17:11.1822383Z * [new branch] update_slow_tests_1763365283 -> origin/update_slow_tests_1763365283 2025-12-04T09:17:11.1824701Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-12-04T09:17:11.1826642Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-12-04T09:17:11.1828588Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-12-04T09:17:11.1830586Z * [new branch] upload-tests-for-autorevert -> origin/upload-tests-for-autorevert 2025-12-04T09:17:11.1832514Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-12-04T09:17:11.1834567Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-12-04T09:17:11.1836543Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-12-04T09:17:11.1838791Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-12-04T09:17:11.1840821Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-12-04T09:17:11.1842838Z * [new branch] v1.3.0 -> origin/v1.3.0 2025-12-04T09:17:11.1844896Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-12-04T09:17:11.1846945Z * [new branch] validate_fn -> origin/validate_fn 2025-12-04T09:17:11.1849569Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-12-04T09:17:11.1851588Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-12-04T09:17:11.1853524Z * [new branch] varlen-api -> origin/varlen-api 2025-12-04T09:17:11.1855526Z * [new branch] varlen-api-backup -> origin/varlen-api-backup 2025-12-04T09:17:11.1857404Z * [new branch] varlen_batch_invariance -> origin/varlen_batch_invariance 2025-12-04T09:17:11.1860250Z * [new branch] viable/strict -> origin/viable/strict 2025-12-04T09:17:11.1863667Z * [new branch] vishal9-team/dtensor_parallelism_toy -> origin/vishal9-team/dtensor_parallelism_toy 2025-12-04T09:17:11.1865308Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-12-04T09:17:11.1867316Z * [new branch] vllmpin -> origin/vllmpin 2025-12-04T09:17:11.1869563Z * [new branch] vscode-recommend-pyrefly -> origin/vscode-recommend-pyrefly 2025-12-04T09:17:11.1871373Z * [new branch] wdvr-patch-1 -> origin/wdvr-patch-1 2025-12-04T09:17:11.1873942Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-12-04T09:17:11.1876539Z * [new branch] whc/pei -> origin/whc/pei 2025-12-04T09:17:11.1878192Z * [new branch] whc/pp_fix -> origin/whc/pp_fix 2025-12-04T09:17:11.1880021Z * [new branch] whc/sharding -> origin/whc/sharding 2025-12-04T09:17:11.1881803Z * [new branch] whc/sharding2 -> origin/whc/sharding2 2025-12-04T09:17:11.1883575Z * [new branch] whc/uneven -> origin/whc/uneven 2025-12-04T09:17:11.1885697Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-12-04T09:17:11.1887650Z * [new branch] win_warnings -> origin/win_warnings 2025-12-04T09:17:11.1889529Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-12-04T09:17:11.1891407Z * [new branch] xmfan-war -> origin/xmfan-war 2025-12-04T09:17:11.1894002Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-12-04T09:17:11.1896190Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-12-04T09:17:11.1898199Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-12-04T09:17:11.1899487Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-12-04T09:17:11.1901308Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-12-04T09:17:11.1903001Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-12-04T09:17:11.1904813Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-12-04T09:17:11.1906879Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-12-04T09:17:11.1909052Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-12-04T09:17:11.1911095Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-12-04T09:17:11.1912913Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 2025-12-04T09:17:11.1914659Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-12-04T09:17:11.1916600Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-12-04T09:17:11.1918310Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-12-04T09:17:11.1920117Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-12-04T09:17:11.1921963Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-12-04T09:17:11.1923807Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 2025-12-04T09:17:11.1928201Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-12-04T09:17:11.1929916Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-12-04T09:17:11.1931720Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-12-04T09:17:11.1933628Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-12-04T09:17:11.1935457Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-12-04T09:17:11.1937486Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T09:17:11.1939298Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T09:17:11.1940959Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-12-04T09:17:11.1942949Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-12-04T09:17:11.1944979Z * [new branch] xmfan/test -> origin/xmfan/test 2025-12-04T09:17:11.1947489Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-12-04T09:17:11.1949314Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-12-04T09:17:11.1951035Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-12-04T09:17:11.1953521Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 2025-12-04T09:17:11.1955353Z * [new branch] yiming/run_with_start_end_rng_hop -> origin/yiming/run_with_start_end_rng_hop 2025-12-04T09:17:11.1957265Z * [new branch] yolo-llama3 -> origin/yolo-llama3 2025-12-04T09:17:11.1959775Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-12-04T09:17:11.1961698Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-12-04T09:17:11.1963454Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-12-04T09:17:11.1965047Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-12-04T09:17:11.1967316Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-12-04T09:17:11.1969152Z * [new branch] zb2p -> origin/zb2p 2025-12-04T09:17:11.1971221Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-12-04T09:17:11.1974206Z * [new branch] zhxchen17/ci/vllm_lora_oom -> origin/zhxchen17/ci/vllm_lora_oom 2025-12-04T09:17:11.1976008Z * [new branch] zhxchen17/ci/vllm_multimodal_oom -> origin/zhxchen17/ci/vllm_multimodal_oom 2025-12-04T09:17:11.1977661Z * [new branch] zhxchen17/ci/vllm_pin -> origin/zhxchen17/ci/vllm_pin 2025-12-04T09:17:11.1980309Z * [new branch] zhxchen17/dynamo/unsafe_drop_all_guards -> origin/zhxchen17/dynamo/unsafe_drop_all_guards 2025-12-04T09:17:11.1982738Z * [new branch] zhxchen17/export/call_override -> origin/zhxchen17/export/call_override 2025-12-04T09:17:11.1984542Z * [new branch] zhxchen17/export/codemod1 -> origin/zhxchen17/export/codemod1 2025-12-04T09:17:11.1986475Z * [new branch] zhxchen17/export/ctx_return -> origin/zhxchen17/export/ctx_return 2025-12-04T09:17:11.1988380Z * [new branch] zhxchen17/export/disable_side_effect_warn -> origin/zhxchen17/export/disable_side_effect_warn 2025-12-04T09:17:11.1990093Z * [new branch] zhxchen17/export/pytree_check -> origin/zhxchen17/export/pytree_check 2025-12-04T09:17:11.1993106Z * [new branch] zhxchen17/precompile/aoti -> origin/zhxchen17/precompile/aoti 2025-12-04T09:17:11.1995175Z * [new branch] zhxchen17/precompile/globals -> origin/zhxchen17/precompile/globals 2025-12-04T09:17:11.1997019Z * [new branch] zhxchen17/precompile/inductor_guards -> origin/zhxchen17/precompile/inductor_guards 2025-12-04T09:17:11.1999338Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-12-04T09:17:11.2001266Z * [new branch] zhxchen17/torch_export_api_update -> origin/zhxchen17/torch_export_api_update 2025-12-04T09:17:11.2003735Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-12-04T09:17:11.2006274Z * [new branch] zxiiro/build-times -> origin/zxiiro/build-times 2025-12-04T09:17:11.2008072Z * [new branch] zxiiro/c7i.2xlarge -> origin/zxiiro/c7i.2xlarge 2025-12-04T09:17:11.2009887Z * [new branch] zxiiro/c7i.2xlarge.h100 -> origin/zxiiro/c7i.2xlarge.h100 2025-12-04T09:17:11.2011746Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-12-04T09:17:11.2013491Z * [new branch] zxiiro/risc64 -> origin/zxiiro/risc64 2025-12-04T09:17:11.2015324Z * [new branch] zxiiro/test-multicloud-arc -> origin/zxiiro/test-multicloud-arc 2025-12-04T09:17:11.2017033Z * [new tag] bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug -> bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug 2025-12-04T09:17:11.2018655Z * [new tag] ci/binaries/77164 -> ci/binaries/77164 2025-12-04T09:17:11.2020311Z * [new tag] ciflow/b200/115316 -> ciflow/b200/115316 2025-12-04T09:17:11.2021578Z * [new tag] ciflow/b200/160685 -> ciflow/b200/160685 2025-12-04T09:17:11.2022591Z * [new tag] ciflow/b200/161607 -> ciflow/b200/161607 2025-12-04T09:17:11.2024490Z * [new tag] ciflow/b200/161938 -> ciflow/b200/161938 2025-12-04T09:17:11.2025487Z * [new tag] ciflow/b200/167207 -> ciflow/b200/167207 2025-12-04T09:17:11.2027059Z * [new tag] ciflow/b200/167989 -> ciflow/b200/167989 2025-12-04T09:17:11.2028341Z * [new tag] ciflow/b200/168096 -> ciflow/b200/168096 2025-12-04T09:17:11.2029704Z * [new tag] ciflow/b200/168175 -> ciflow/b200/168175 2025-12-04T09:17:11.2031173Z * [new tag] ciflow/b200/168195 -> ciflow/b200/168195 2025-12-04T09:17:11.2032078Z * [new tag] ciflow/b200/169200 -> ciflow/b200/169200 2025-12-04T09:17:11.2033717Z * [new tag] ciflow/b200/169216 -> ciflow/b200/169216 2025-12-04T09:17:11.2035462Z * [new tag] ciflow/b200/169380 -> ciflow/b200/169380 2025-12-04T09:17:11.2037285Z * [new tag] ciflow/b200/169412 -> ciflow/b200/169412 2025-12-04T09:17:11.2038810Z * [new tag] ciflow/b200/169470 -> ciflow/b200/169470 2025-12-04T09:17:11.2040115Z * [new tag] ciflow/b200/169471 -> ciflow/b200/169471 2025-12-04T09:17:11.2041448Z * [new tag] ciflow/b200/169472 -> ciflow/b200/169472 2025-12-04T09:17:11.2042941Z * [new tag] ciflow/b200/169514 -> ciflow/b200/169514 2025-12-04T09:17:11.2044537Z * [new tag] ciflow/b200/169517 -> ciflow/b200/169517 2025-12-04T09:17:11.2046289Z * [new tag] ciflow/binaries/165922 -> ciflow/binaries/165922 2025-12-04T09:17:11.2047225Z * [new tag] ciflow/binaries/169510 -> ciflow/binaries/169510 2025-12-04T09:17:11.2049167Z * [new tag] ciflow/binaries_wheel/157994 -> ciflow/binaries_wheel/157994 2025-12-04T09:17:11.2050568Z * [new tag] ciflow/binaries_wheel/166829 -> ciflow/binaries_wheel/166829 2025-12-04T09:17:11.2051532Z * [new tag] ciflow/binaries_wheel/167972 -> ciflow/binaries_wheel/167972 2025-12-04T09:17:11.2053294Z * [new tag] ciflow/binaries_wheel/167981 -> ciflow/binaries_wheel/167981 2025-12-04T09:17:11.2054782Z * [new tag] ciflow/dynamo/167695 -> ciflow/dynamo/167695 2025-12-04T09:17:11.2055727Z * [new tag] ciflow/dynamo/168096 -> ciflow/dynamo/168096 2025-12-04T09:17:11.2057534Z * [new tag] ciflow/dynamo/169525 -> ciflow/dynamo/169525 2025-12-04T09:17:11.2059129Z * [new tag] ciflow/h100-cutlass-backend/161938 -> ciflow/h100-cutlass-backend/161938 2025-12-04T09:17:11.2060034Z * [new tag] ciflow/h100-cutlass-backend/161940 -> ciflow/h100-cutlass-backend/161940 2025-12-04T09:17:11.2061907Z * [new tag] ciflow/h100-distributed/168923 -> ciflow/h100-distributed/168923 2025-12-04T09:17:11.2063499Z * [new tag] ciflow/h100-symm-mem/167552 -> ciflow/h100-symm-mem/167552 2025-12-04T09:17:11.2064661Z * [new tag] ciflow/h100-symm-mem/168129 -> ciflow/h100-symm-mem/168129 2025-12-04T09:17:11.2066129Z * [new tag] ciflow/h100-symm-mem/168917 -> ciflow/h100-symm-mem/168917 2025-12-04T09:17:11.2067490Z * [new tag] ciflow/h100-symm-mem/169156 -> ciflow/h100-symm-mem/169156 2025-12-04T09:17:11.2068404Z * [new tag] ciflow/h100-symm-mem/169200 -> ciflow/h100-symm-mem/169200 2025-12-04T09:17:11.2069959Z * [new tag] ciflow/h100-symm-mem/169216 -> ciflow/h100-symm-mem/169216 2025-12-04T09:17:11.2070910Z * [new tag] ciflow/h100-symm-mem/169338 -> ciflow/h100-symm-mem/169338 2025-12-04T09:17:11.2072520Z * [new tag] ciflow/h100-symm-mem/169355 -> ciflow/h100-symm-mem/169355 2025-12-04T09:17:11.2073478Z * [new tag] ciflow/h100-symm-mem/169543 -> ciflow/h100-symm-mem/169543 2025-12-04T09:17:11.2075242Z * [new tag] ciflow/h100/115316 -> ciflow/h100/115316 2025-12-04T09:17:11.2076613Z * [new tag] ciflow/h100/160685 -> ciflow/h100/160685 2025-12-04T09:17:11.2077480Z * [new tag] ciflow/h100/160729 -> ciflow/h100/160729 2025-12-04T09:17:11.2078961Z * [new tag] ciflow/h100/161607 -> ciflow/h100/161607 2025-12-04T09:17:11.2080262Z * [new tag] ciflow/h100/161938 -> ciflow/h100/161938 2025-12-04T09:17:11.2081280Z * [new tag] ciflow/h100/167207 -> ciflow/h100/167207 2025-12-04T09:17:11.2082628Z * [new tag] ciflow/h100/167989 -> ciflow/h100/167989 2025-12-04T09:17:11.2083560Z * [new tag] ciflow/h100/168096 -> ciflow/h100/168096 2025-12-04T09:17:11.2085109Z * [new tag] ciflow/h100/168175 -> ciflow/h100/168175 2025-12-04T09:17:11.2086044Z * [new tag] ciflow/h100/168195 -> ciflow/h100/168195 2025-12-04T09:17:11.2087521Z * [new tag] ciflow/h100/168980 -> ciflow/h100/168980 2025-12-04T09:17:11.2089150Z * [new tag] ciflow/h100/169200 -> ciflow/h100/169200 2025-12-04T09:17:11.2090842Z * [new tag] ciflow/h100/169216 -> ciflow/h100/169216 2025-12-04T09:17:11.2092325Z * [new tag] ciflow/h100/169380 -> ciflow/h100/169380 2025-12-04T09:17:11.2094140Z * [new tag] ciflow/h100/169412 -> ciflow/h100/169412 2025-12-04T09:17:11.2095522Z * [new tag] ciflow/h100/169470 -> ciflow/h100/169470 2025-12-04T09:17:11.2096859Z * [new tag] ciflow/h100/169471 -> ciflow/h100/169471 2025-12-04T09:17:11.2098029Z * [new tag] ciflow/h100/169472 -> ciflow/h100/169472 2025-12-04T09:17:11.2099430Z * [new tag] ciflow/h100/169514 -> ciflow/h100/169514 2025-12-04T09:17:11.2101009Z * [new tag] ciflow/inductor-cu126/168096 -> ciflow/inductor-cu126/168096 2025-12-04T09:17:11.2102952Z * [new tag] ciflow/inductor-micro-benchmark-cpu-x86/168096 -> ciflow/inductor-micro-benchmark-cpu-x86/168096 2025-12-04T09:17:11.2104502Z * [new tag] ciflow/inductor-micro-benchmark/166165 -> ciflow/inductor-micro-benchmark/166165 2025-12-04T09:17:11.2105486Z * [new tag] ciflow/inductor-micro-benchmark/168096 -> ciflow/inductor-micro-benchmark/168096 2025-12-04T09:17:11.2107296Z * [new tag] ciflow/inductor-perf-compare/168096 -> ciflow/inductor-perf-compare/168096 2025-12-04T09:17:11.2109226Z * [new tag] ciflow/inductor-perf-test-nightly-rocm-mi300/168073 -> ciflow/inductor-perf-test-nightly-rocm-mi300/168073 2025-12-04T09:17:11.2110228Z * [new tag] ciflow/inductor-perf-test-nightly-rocm-mi300/168096 -> ciflow/inductor-perf-test-nightly-rocm-mi300/168096 2025-12-04T09:17:11.2111890Z * [new tag] ciflow/inductor-perf-test-nightly-rocm-mi300/169024 -> ciflow/inductor-perf-test-nightly-rocm-mi300/169024 2025-12-04T09:17:11.2113456Z * [new tag] ciflow/inductor-perf-test-nightly-rocm-mi355/169024 -> ciflow/inductor-perf-test-nightly-rocm-mi355/169024 2025-12-04T09:17:11.2114584Z * [new tag] ciflow/inductor-perf-test-nightly/168096 -> ciflow/inductor-perf-test-nightly/168096 2025-12-04T09:17:11.2116280Z * [new tag] ciflow/inductor-periodic/168096 -> ciflow/inductor-periodic/168096 2025-12-04T09:17:11.2117251Z * [new tag] ciflow/inductor-periodic/169024 -> ciflow/inductor-periodic/169024 2025-12-04T09:17:11.2118943Z * [new tag] ciflow/inductor-periodic/169425 -> ciflow/inductor-periodic/169425 2025-12-04T09:17:11.2120633Z * [new tag] ciflow/inductor-rocm-mi200/165545 -> ciflow/inductor-rocm-mi200/165545 2025-12-04T09:17:11.2122014Z * [new tag] ciflow/inductor-rocm-mi200/165997 -> ciflow/inductor-rocm-mi200/165997 2025-12-04T09:17:11.2122924Z * [new tag] ciflow/inductor-rocm-mi200/168096 -> ciflow/inductor-rocm-mi200/168096 2025-12-04T09:17:11.2125164Z * [new tag] ciflow/inductor-rocm-mi200/169063 -> ciflow/inductor-rocm-mi200/169063 2025-12-04T09:17:11.2125800Z * [new tag] ciflow/inductor-rocm-mi200/169425 -> ciflow/inductor-rocm-mi200/169425 2025-12-04T09:17:11.2128014Z * [new tag] ciflow/inductor-rocm-mi300/165545 -> ciflow/inductor-rocm-mi300/165545 2025-12-04T09:17:11.2128845Z * [new tag] ciflow/inductor-rocm-mi300/168096 -> ciflow/inductor-rocm-mi300/168096 2025-12-04T09:17:11.2129922Z * [new tag] ciflow/inductor-rocm-mi300/169063 -> ciflow/inductor-rocm-mi300/169063 2025-12-04T09:17:11.2131062Z * [new tag] ciflow/inductor-rocm-mi300/169425 -> ciflow/inductor-rocm-mi300/169425 2025-12-04T09:17:11.2133860Z * [new tag] ciflow/inductor-rocm/162052 -> ciflow/inductor-rocm/162052 2025-12-04T09:17:11.2134885Z * [new tag] ciflow/inductor-rocm/168971 -> ciflow/inductor-rocm/168971 2025-12-04T09:17:11.2136822Z * [new tag] ciflow/inductor-windows/168096 -> ciflow/inductor-windows/168096 2025-12-04T09:17:11.2137918Z * [new tag] ciflow/inductor/144542 -> ciflow/inductor/144542 2025-12-04T09:17:11.2139100Z * [new tag] ciflow/inductor/146506 -> ciflow/inductor/146506 2025-12-04T09:17:11.2140352Z * [new tag] ciflow/inductor/147990 -> ciflow/inductor/147990 2025-12-04T09:17:11.2142954Z * [new tag] ciflow/inductor/148294 -> ciflow/inductor/148294 2025-12-04T09:17:11.2143839Z * [new tag] ciflow/inductor/148492 -> ciflow/inductor/148492 2025-12-04T09:17:11.2145205Z * [new tag] ciflow/inductor/157149 -> ciflow/inductor/157149 2025-12-04T09:17:11.2146983Z * [new tag] ciflow/inductor/157994 -> ciflow/inductor/157994 2025-12-04T09:17:11.2147858Z * [new tag] ciflow/inductor/160685 -> ciflow/inductor/160685 2025-12-04T09:17:11.2149011Z * [new tag] ciflow/inductor/160686 -> ciflow/inductor/160686 2025-12-04T09:17:11.2150231Z * [new tag] ciflow/inductor/160687 -> ciflow/inductor/160687 2025-12-04T09:17:11.2151928Z * [new tag] ciflow/inductor/160688 -> ciflow/inductor/160688 2025-12-04T09:17:11.2153174Z * [new tag] ciflow/inductor/160706 -> ciflow/inductor/160706 2025-12-04T09:17:11.2155273Z * [new tag] ciflow/inductor/160729 -> ciflow/inductor/160729 2025-12-04T09:17:11.2156555Z * [new tag] ciflow/inductor/161938 -> ciflow/inductor/161938 2025-12-04T09:17:11.2157798Z * [new tag] ciflow/inductor/161939 -> ciflow/inductor/161939 2025-12-04T09:17:11.2159202Z * [new tag] ciflow/inductor/161940 -> ciflow/inductor/161940 2025-12-04T09:17:11.2160602Z * [new tag] ciflow/inductor/162052 -> ciflow/inductor/162052 2025-12-04T09:17:11.2161739Z * [new tag] ciflow/inductor/162275 -> ciflow/inductor/162275 2025-12-04T09:17:11.2163343Z * [new tag] ciflow/inductor/162795 -> ciflow/inductor/162795 2025-12-04T09:17:11.2164577Z * [new tag] ciflow/inductor/163245 -> ciflow/inductor/163245 2025-12-04T09:17:11.2165899Z * [new tag] ciflow/inductor/163335 -> ciflow/inductor/163335 2025-12-04T09:17:11.2167493Z * [new tag] ciflow/inductor/163503 -> ciflow/inductor/163503 2025-12-04T09:17:11.2168653Z * [new tag] ciflow/inductor/163942 -> ciflow/inductor/163942 2025-12-04T09:17:11.2170320Z * [new tag] ciflow/inductor/165270 -> ciflow/inductor/165270 2025-12-04T09:17:11.2171393Z * [new tag] ciflow/inductor/165274 -> ciflow/inductor/165274 2025-12-04T09:17:11.2173057Z * [new tag] ciflow/inductor/165322 -> ciflow/inductor/165322 2025-12-04T09:17:11.2174076Z * [new tag] ciflow/inductor/165597 -> ciflow/inductor/165597 2025-12-04T09:17:11.2175317Z * [new tag] ciflow/inductor/166063 -> ciflow/inductor/166063 2025-12-04T09:17:11.2176922Z * [new tag] ciflow/inductor/166075 -> ciflow/inductor/166075 2025-12-04T09:17:11.2178123Z * [new tag] ciflow/inductor/166165 -> ciflow/inductor/166165 2025-12-04T09:17:11.2179815Z * [new tag] ciflow/inductor/166254 -> ciflow/inductor/166254 2025-12-04T09:17:11.2180845Z * [new tag] ciflow/inductor/166483 -> ciflow/inductor/166483 2025-12-04T09:17:11.2182242Z * [new tag] ciflow/inductor/166494 -> ciflow/inductor/166494 2025-12-04T09:17:11.2183614Z * [new tag] ciflow/inductor/166545 -> ciflow/inductor/166545 2025-12-04T09:17:11.2185231Z * [new tag] ciflow/inductor/166788 -> ciflow/inductor/166788 2025-12-04T09:17:11.2186389Z * [new tag] ciflow/inductor/166846 -> ciflow/inductor/166846 2025-12-04T09:17:11.2187603Z * [new tag] ciflow/inductor/167300 -> ciflow/inductor/167300 2025-12-04T09:17:11.2189212Z * [new tag] ciflow/inductor/167407 -> ciflow/inductor/167407 2025-12-04T09:17:11.2190614Z * [new tag] ciflow/inductor/167536 -> ciflow/inductor/167536 2025-12-04T09:17:11.2191720Z * [new tag] ciflow/inductor/167552 -> ciflow/inductor/167552 2025-12-04T09:17:11.2193113Z * [new tag] ciflow/inductor/167555 -> ciflow/inductor/167555 2025-12-04T09:17:11.2194813Z * [new tag] ciflow/inductor/167583 -> ciflow/inductor/167583 2025-12-04T09:17:11.2195979Z * [new tag] ciflow/inductor/167599 -> ciflow/inductor/167599 2025-12-04T09:17:11.2197147Z * [new tag] ciflow/inductor/167647 -> ciflow/inductor/167647 2025-12-04T09:17:11.2198760Z * [new tag] ciflow/inductor/167677 -> ciflow/inductor/167677 2025-12-04T09:17:11.2199840Z * [new tag] ciflow/inductor/167680 -> ciflow/inductor/167680 2025-12-04T09:17:11.2201079Z * [new tag] ciflow/inductor/167695 -> ciflow/inductor/167695 2025-12-04T09:17:11.2202714Z * [new tag] ciflow/inductor/167742 -> ciflow/inductor/167742 2025-12-04T09:17:11.2203743Z * [new tag] ciflow/inductor/167768 -> ciflow/inductor/167768 2025-12-04T09:17:11.2205585Z * [new tag] ciflow/inductor/167773 -> ciflow/inductor/167773 2025-12-04T09:17:11.2206713Z * [new tag] ciflow/inductor/167781 -> ciflow/inductor/167781 2025-12-04T09:17:11.2208114Z * [new tag] ciflow/inductor/167880 -> ciflow/inductor/167880 2025-12-04T09:17:11.2209490Z * [new tag] ciflow/inductor/167887 -> ciflow/inductor/167887 2025-12-04T09:17:11.2210673Z * [new tag] ciflow/inductor/167972 -> ciflow/inductor/167972 2025-12-04T09:17:11.2212414Z * [new tag] ciflow/inductor/167989 -> ciflow/inductor/167989 2025-12-04T09:17:11.2213433Z * [new tag] ciflow/inductor/168002 -> ciflow/inductor/168002 2025-12-04T09:17:11.2214819Z * [new tag] ciflow/inductor/168050 -> ciflow/inductor/168050 2025-12-04T09:17:11.2216023Z * [new tag] ciflow/inductor/168051 -> ciflow/inductor/168051 2025-12-04T09:17:11.2217604Z * [new tag] ciflow/inductor/168052 -> ciflow/inductor/168052 2025-12-04T09:17:11.2218684Z * [new tag] ciflow/inductor/168073 -> ciflow/inductor/168073 2025-12-04T09:17:11.2220220Z * [new tag] ciflow/inductor/168096 -> ciflow/inductor/168096 2025-12-04T09:17:11.2221301Z * [new tag] ciflow/inductor/168114 -> ciflow/inductor/168114 2025-12-04T09:17:11.2222745Z * [new tag] ciflow/inductor/168115 -> ciflow/inductor/168115 2025-12-04T09:17:11.2224499Z * [new tag] ciflow/inductor/168127 -> ciflow/inductor/168127 2025-12-04T09:17:11.2225573Z * [new tag] ciflow/inductor/168129 -> ciflow/inductor/168129 2025-12-04T09:17:11.2227152Z * [new tag] ciflow/inductor/168157 -> ciflow/inductor/168157 2025-12-04T09:17:11.2228352Z * [new tag] ciflow/inductor/168175 -> ciflow/inductor/168175 2025-12-04T09:17:11.2230301Z * [new tag] ciflow/inductor/168185 -> ciflow/inductor/168185 2025-12-04T09:17:11.2231367Z * [new tag] ciflow/inductor/168195 -> ciflow/inductor/168195 2025-12-04T09:17:11.2232657Z * [new tag] ciflow/inductor/168209 -> ciflow/inductor/168209 2025-12-04T09:17:11.2234311Z * [new tag] ciflow/inductor/168266 -> ciflow/inductor/168266 2025-12-04T09:17:11.2235380Z * [new tag] ciflow/inductor/168316 -> ciflow/inductor/168316 2025-12-04T09:17:11.2237144Z * [new tag] ciflow/inductor/168326 -> ciflow/inductor/168326 2025-12-04T09:17:11.2238181Z * [new tag] ciflow/inductor/168368 -> ciflow/inductor/168368 2025-12-04T09:17:11.2239826Z * [new tag] ciflow/inductor/168894 -> ciflow/inductor/168894 2025-12-04T09:17:11.2240817Z * [new tag] ciflow/inductor/168934 -> ciflow/inductor/168934 2025-12-04T09:17:11.2242210Z * [new tag] ciflow/inductor/168939 -> ciflow/inductor/168939 2025-12-04T09:17:11.2243791Z * [new tag] ciflow/inductor/168946 -> ciflow/inductor/168946 2025-12-04T09:17:11.2244817Z * [new tag] ciflow/inductor/168950 -> ciflow/inductor/168950 2025-12-04T09:17:11.2246378Z * [new tag] ciflow/inductor/168951 -> ciflow/inductor/168951 2025-12-04T09:17:11.2247457Z * [new tag] ciflow/inductor/168952 -> ciflow/inductor/168952 2025-12-04T09:17:11.2248860Z * [new tag] ciflow/inductor/168955 -> ciflow/inductor/168955 2025-12-04T09:17:11.2250075Z * [new tag] ciflow/inductor/168971 -> ciflow/inductor/168971 2025-12-04T09:17:11.2251665Z * [new tag] ciflow/inductor/168979 -> ciflow/inductor/168979 2025-12-04T09:17:11.2252748Z * [new tag] ciflow/inductor/168980 -> ciflow/inductor/168980 2025-12-04T09:17:11.2254532Z * [new tag] ciflow/inductor/168983 -> ciflow/inductor/168983 2025-12-04T09:17:11.2255600Z * [new tag] ciflow/inductor/169006 -> ciflow/inductor/169006 2025-12-04T09:17:11.2257274Z * [new tag] ciflow/inductor/169023 -> ciflow/inductor/169023 2025-12-04T09:17:11.2258299Z * [new tag] ciflow/inductor/169024 -> ciflow/inductor/169024 2025-12-04T09:17:11.2259928Z * [new tag] ciflow/inductor/169025 -> ciflow/inductor/169025 2025-12-04T09:17:11.2260944Z * [new tag] ciflow/inductor/169066 -> ciflow/inductor/169066 2025-12-04T09:17:11.2262628Z * [new tag] ciflow/inductor/169091 -> ciflow/inductor/169091 2025-12-04T09:17:11.2263714Z * [new tag] ciflow/inductor/169102 -> ciflow/inductor/169102 2025-12-04T09:17:11.2265350Z * [new tag] ciflow/inductor/169103 -> ciflow/inductor/169103 2025-12-04T09:17:11.2266323Z * [new tag] ciflow/inductor/169121 -> ciflow/inductor/169121 2025-12-04T09:17:11.2267888Z * [new tag] ciflow/inductor/169134 -> ciflow/inductor/169134 2025-12-04T09:17:11.2268965Z * [new tag] ciflow/inductor/169135 -> ciflow/inductor/169135 2025-12-04T09:17:11.2270566Z * [new tag] ciflow/inductor/169141 -> ciflow/inductor/169141 2025-12-04T09:17:11.2271594Z * [new tag] ciflow/inductor/169151 -> ciflow/inductor/169151 2025-12-04T09:17:11.2273283Z * [new tag] ciflow/inductor/169161 -> ciflow/inductor/169161 2025-12-04T09:17:11.2274388Z * [new tag] ciflow/inductor/169167 -> ciflow/inductor/169167 2025-12-04T09:17:11.2276257Z * [new tag] ciflow/inductor/169177 -> ciflow/inductor/169177 2025-12-04T09:17:11.2277517Z * [new tag] ciflow/inductor/169185 -> ciflow/inductor/169185 2025-12-04T09:17:11.2278652Z * [new tag] ciflow/inductor/169196 -> ciflow/inductor/169196 2025-12-04T09:17:11.2280279Z * [new tag] ciflow/inductor/169200 -> ciflow/inductor/169200 2025-12-04T09:17:11.2281424Z * [new tag] ciflow/inductor/169204 -> ciflow/inductor/169204 2025-12-04T09:17:11.2282674Z * [new tag] ciflow/inductor/169216 -> ciflow/inductor/169216 2025-12-04T09:17:11.2284069Z * [new tag] ciflow/inductor/169219 -> ciflow/inductor/169219 2025-12-04T09:17:11.2285262Z * [new tag] ciflow/inductor/169220 -> ciflow/inductor/169220 2025-12-04T09:17:11.2287042Z * [new tag] ciflow/inductor/169230 -> ciflow/inductor/169230 2025-12-04T09:17:11.2288159Z * [new tag] ciflow/inductor/169242 -> ciflow/inductor/169242 2025-12-04T09:17:11.2289731Z * [new tag] ciflow/inductor/169245 -> ciflow/inductor/169245 2025-12-04T09:17:11.2290967Z * [new tag] ciflow/inductor/169260 -> ciflow/inductor/169260 2025-12-04T09:17:11.2292366Z * [new tag] ciflow/inductor/169282 -> ciflow/inductor/169282 2025-12-04T09:17:11.2293588Z * [new tag] ciflow/inductor/169286 -> ciflow/inductor/169286 2025-12-04T09:17:11.2295212Z * [new tag] ciflow/inductor/169299 -> ciflow/inductor/169299 2025-12-04T09:17:11.2296382Z * [new tag] ciflow/inductor/169304 -> ciflow/inductor/169304 2025-12-04T09:17:11.2298353Z * [new tag] ciflow/inductor/169305 -> ciflow/inductor/169305 2025-12-04T09:17:11.2299529Z * [new tag] ciflow/inductor/169308 -> ciflow/inductor/169308 2025-12-04T09:17:11.2300916Z * [new tag] ciflow/inductor/169319 -> ciflow/inductor/169319 2025-12-04T09:17:11.2302146Z * [new tag] ciflow/inductor/169326 -> ciflow/inductor/169326 2025-12-04T09:17:11.2303880Z * [new tag] ciflow/inductor/169332 -> ciflow/inductor/169332 2025-12-04T09:17:11.2305007Z * [new tag] ciflow/inductor/169333 -> ciflow/inductor/169333 2025-12-04T09:17:11.2306863Z * [new tag] ciflow/inductor/169336 -> ciflow/inductor/169336 2025-12-04T09:17:11.2307884Z * [new tag] ciflow/inductor/169340 -> ciflow/inductor/169340 2025-12-04T09:17:11.2309513Z * [new tag] ciflow/inductor/169341 -> ciflow/inductor/169341 2025-12-04T09:17:11.2310539Z * [new tag] ciflow/inductor/169343 -> ciflow/inductor/169343 2025-12-04T09:17:11.2312198Z * [new tag] ciflow/inductor/169346 -> ciflow/inductor/169346 2025-12-04T09:17:11.2313370Z * [new tag] ciflow/inductor/169348 -> ciflow/inductor/169348 2025-12-04T09:17:11.2315088Z * [new tag] ciflow/inductor/169350 -> ciflow/inductor/169350 2025-12-04T09:17:11.2316206Z * [new tag] ciflow/inductor/169355 -> ciflow/inductor/169355 2025-12-04T09:17:11.2318236Z * [new tag] ciflow/inductor/169370 -> ciflow/inductor/169370 2025-12-04T09:17:11.2319960Z * [new tag] ciflow/inductor/169375 -> ciflow/inductor/169375 2025-12-04T09:17:11.2321026Z * [new tag] ciflow/inductor/169389 -> ciflow/inductor/169389 2025-12-04T09:17:11.2322597Z * [new tag] ciflow/inductor/169391 -> ciflow/inductor/169391 2025-12-04T09:17:11.2323743Z * [new tag] ciflow/inductor/169393 -> ciflow/inductor/169393 2025-12-04T09:17:11.2325064Z * [new tag] ciflow/inductor/169399 -> ciflow/inductor/169399 2025-12-04T09:17:11.2329167Z * [new tag] ciflow/inductor/169400 -> ciflow/inductor/169400 2025-12-04T09:17:11.2330235Z * [new tag] ciflow/inductor/169415 -> ciflow/inductor/169415 2025-12-04T09:17:11.2332019Z * [new tag] ciflow/inductor/169417 -> ciflow/inductor/169417 2025-12-04T09:17:11.2332968Z * [new tag] ciflow/inductor/169418 -> ciflow/inductor/169418 2025-12-04T09:17:11.2334755Z * [new tag] ciflow/inductor/169430 -> ciflow/inductor/169430 2025-12-04T09:17:11.2335838Z * [new tag] ciflow/inductor/169432 -> ciflow/inductor/169432 2025-12-04T09:17:11.2337438Z * [new tag] ciflow/inductor/169436 -> ciflow/inductor/169436 2025-12-04T09:17:11.2338646Z * [new tag] ciflow/inductor/169437 -> ciflow/inductor/169437 2025-12-04T09:17:11.2340219Z * [new tag] ciflow/inductor/169438 -> ciflow/inductor/169438 2025-12-04T09:17:11.2341308Z * [new tag] ciflow/inductor/169441 -> ciflow/inductor/169441 2025-12-04T09:17:11.2343005Z * [new tag] ciflow/inductor/169446 -> ciflow/inductor/169446 2025-12-04T09:17:11.2344554Z * [new tag] ciflow/inductor/169447 -> ciflow/inductor/169447 2025-12-04T09:17:11.2345559Z * [new tag] ciflow/inductor/169452 -> ciflow/inductor/169452 2025-12-04T09:17:11.2347530Z * [new tag] ciflow/inductor/169455 -> ciflow/inductor/169455 2025-12-04T09:17:11.2348709Z * [new tag] ciflow/inductor/169459 -> ciflow/inductor/169459 2025-12-04T09:17:11.2350471Z * [new tag] ciflow/inductor/169463 -> ciflow/inductor/169463 2025-12-04T09:17:11.2351648Z * [new tag] ciflow/inductor/169476 -> ciflow/inductor/169476 2025-12-04T09:17:11.2353201Z * [new tag] ciflow/inductor/169485 -> ciflow/inductor/169485 2025-12-04T09:17:11.2354301Z * [new tag] ciflow/inductor/169493 -> ciflow/inductor/169493 2025-12-04T09:17:11.2355985Z * [new tag] ciflow/inductor/169496 -> ciflow/inductor/169496 2025-12-04T09:17:11.2366063Z * [new tag] ciflow/inductor/169497 -> ciflow/inductor/169497 2025-12-04T09:17:11.2366611Z * [new tag] ciflow/inductor/169503 -> ciflow/inductor/169503 2025-12-04T09:17:11.2367127Z * [new tag] ciflow/inductor/169504 -> ciflow/inductor/169504 2025-12-04T09:17:11.2367640Z * [new tag] ciflow/inductor/169505 -> ciflow/inductor/169505 2025-12-04T09:17:11.2368155Z * [new tag] ciflow/inductor/169508 -> ciflow/inductor/169508 2025-12-04T09:17:11.2368668Z * [new tag] ciflow/inductor/169509 -> ciflow/inductor/169509 2025-12-04T09:17:11.2369181Z * [new tag] ciflow/inductor/169513 -> ciflow/inductor/169513 2025-12-04T09:17:11.2369705Z * [new tag] ciflow/inductor/169514 -> ciflow/inductor/169514 2025-12-04T09:17:11.2370211Z * [new tag] ciflow/inductor/169515 -> ciflow/inductor/169515 2025-12-04T09:17:11.2370723Z * [new tag] ciflow/inductor/169517 -> ciflow/inductor/169517 2025-12-04T09:17:11.2371218Z * [new tag] ciflow/inductor/169519 -> ciflow/inductor/169519 2025-12-04T09:17:11.2372804Z * [new tag] ciflow/inductor/169520 -> ciflow/inductor/169520 2025-12-04T09:17:11.2373853Z * [new tag] ciflow/inductor/169521 -> ciflow/inductor/169521 2025-12-04T09:17:11.2375464Z * [new tag] ciflow/inductor/169524 -> ciflow/inductor/169524 2025-12-04T09:17:11.2376514Z * [new tag] ciflow/inductor/169527 -> ciflow/inductor/169527 2025-12-04T09:17:11.2378143Z * [new tag] ciflow/inductor/169528 -> ciflow/inductor/169528 2025-12-04T09:17:11.2379288Z * [new tag] ciflow/inductor/169532 -> ciflow/inductor/169532 2025-12-04T09:17:11.2380919Z * [new tag] ciflow/inductor/169535 -> ciflow/inductor/169535 2025-12-04T09:17:11.2381976Z * [new tag] ciflow/inductor/169536 -> ciflow/inductor/169536 2025-12-04T09:17:11.2383798Z * [new tag] ciflow/inductor/169547 -> ciflow/inductor/169547 2025-12-04T09:17:11.2384816Z * [new tag] ciflow/inductor/169548 -> ciflow/inductor/169548 2025-12-04T09:17:11.2386159Z * [new tag] ciflow/inductor/169549 -> ciflow/inductor/169549 2025-12-04T09:17:11.2387453Z * [new tag] ciflow/inductor/169551 -> ciflow/inductor/169551 2025-12-04T09:17:11.2389096Z * [new tag] ciflow/inductor/169552 -> ciflow/inductor/169552 2025-12-04T09:17:11.2390089Z * [new tag] ciflow/inductor/169553 -> ciflow/inductor/169553 2025-12-04T09:17:11.2391783Z * [new tag] ciflow/inductor/169557 -> ciflow/inductor/169557 2025-12-04T09:17:11.2393700Z * [new tag] ciflow/inductor/3b9a386 -> ciflow/inductor/3b9a386 2025-12-04T09:17:11.2394707Z * [new tag] ciflow/inductor/3d4b92b -> ciflow/inductor/3d4b92b 2025-12-04T09:17:11.2396427Z * [new tag] ciflow/inductor/d224ac7 -> ciflow/inductor/d224ac7 2025-12-04T09:17:11.2397973Z * [new tag] ciflow/linux-aarch64/157994 -> ciflow/linux-aarch64/157994 2025-12-04T09:17:11.2399670Z * [new tag] ciflow/linux-aarch64/166075 -> ciflow/linux-aarch64/166075 2025-12-04T09:17:11.2400701Z * [new tag] ciflow/linux-aarch64/166876 -> ciflow/linux-aarch64/166876 2025-12-04T09:17:11.2401867Z * [new tag] ciflow/linux-aarch64/167981 -> ciflow/linux-aarch64/167981 2025-12-04T09:17:11.2403674Z * [new tag] ciflow/mps/166254 -> ciflow/mps/166254 2025-12-04T09:17:11.2404623Z * [new tag] ciflow/mps/169017 -> ciflow/mps/169017 2025-12-04T09:17:11.2406238Z * [new tag] ciflow/mps/169372 -> ciflow/mps/169372 2025-12-04T09:17:11.2407229Z * [new tag] ciflow/mps/169478 -> ciflow/mps/169478 2025-12-04T09:17:11.2409180Z * [new tag] ciflow/op-benchmark/157994 -> ciflow/op-benchmark/157994 2025-12-04T09:17:11.2410096Z * [new tag] ciflow/op-benchmark/166075 -> ciflow/op-benchmark/166075 2025-12-04T09:17:11.2411313Z * [new tag] ciflow/op-benchmark/169544 -> ciflow/op-benchmark/169544 2025-12-04T09:17:11.2413257Z * [new tag] ciflow/periodic-rocm-mi200/165997 -> ciflow/periodic-rocm-mi200/165997 2025-12-04T09:17:11.2414329Z * [new tag] ciflow/periodic-rocm-mi200/166517 -> ciflow/periodic-rocm-mi200/166517 2025-12-04T09:17:11.2415517Z * [new tag] ciflow/periodic-rocm-mi200/169063 -> ciflow/periodic-rocm-mi200/169063 2025-12-04T09:17:11.2416795Z * [new tag] ciflow/periodic-rocm-mi200/169425 -> ciflow/periodic-rocm-mi200/169425 2025-12-04T09:17:11.2418687Z * [new tag] ciflow/periodic-rocm-mi300/166517 -> ciflow/periodic-rocm-mi300/166517 2025-12-04T09:17:11.2419607Z * [new tag] ciflow/periodic-rocm-mi300/169063 -> ciflow/periodic-rocm-mi300/169063 2025-12-04T09:17:11.2420791Z * [new tag] ciflow/periodic-rocm-mi300/169425 -> ciflow/periodic-rocm-mi300/169425 2025-12-04T09:17:11.2422932Z * [new tag] ciflow/periodic/054a2fd -> ciflow/periodic/054a2fd 2025-12-04T09:17:11.2423956Z * [new tag] ciflow/periodic/167207 -> ciflow/periodic/167207 2025-12-04T09:17:11.2425880Z * [new tag] ciflow/periodic/167978 -> ciflow/periodic/167978 2025-12-04T09:17:11.2426898Z * [new tag] ciflow/periodic/168096 -> ciflow/periodic/168096 2025-12-04T09:17:11.2428102Z * [new tag] ciflow/periodic/169286 -> ciflow/periodic/169286 2025-12-04T09:17:11.2429695Z * [new tag] ciflow/periodic/2a6d37d -> ciflow/periodic/2a6d37d 2025-12-04T09:17:11.2431029Z * [new tag] ciflow/periodic/317eeb8 -> ciflow/periodic/317eeb8 2025-12-04T09:17:11.2432992Z * [new tag] ciflow/periodic/3c32 -> ciflow/periodic/3c32 2025-12-04T09:17:11.2433897Z * [new tag] ciflow/periodic/3e98831 -> ciflow/periodic/3e98831 2025-12-04T09:17:11.2436339Z * [new tag] ciflow/periodic/7c648509a7470ace9fb2bae960dd4790f7e943e9 -> ciflow/periodic/7c648509a7470ace9fb2bae960dd4790f7e943e9 2025-12-04T09:17:11.2437928Z * [new tag] ciflow/periodic/94512-point -> ciflow/periodic/94512-point 2025-12-04T09:17:11.2440107Z * [new tag] ciflow/periodic/csl/test87519 -> ciflow/periodic/csl/test87519 2025-12-04T09:17:11.2441279Z * [new tag] ciflow/periodic/csltest88275 -> ciflow/periodic/csltest88275 2025-12-04T09:17:11.2443006Z * [new tag] ciflow/periodic/csltest88761 -> ciflow/periodic/csltest88761 2025-12-04T09:17:11.2444285Z * [new tag] ciflow/periodic/release_1.12 -> ciflow/periodic/release_1.12 2025-12-04T09:17:11.2446295Z * [new tag] ciflow/periodic/release_1.12.0 -> ciflow/periodic/release_1.12.0 2025-12-04T09:17:11.2447574Z * [new tag] ciflow/periodic/sha-ec5b83 -> ciflow/periodic/sha-ec5b83 2025-12-04T09:17:11.2449350Z * [new tag] ciflow/pull/167207 -> ciflow/pull/167207 2025-12-04T09:17:11.2451172Z * [new tag] ciflow/quantization-periodic/169207 -> ciflow/quantization-periodic/169207 2025-12-04T09:17:11.2452308Z * [new tag] ciflow/rocm-mi200/165545 -> ciflow/rocm-mi200/165545 2025-12-04T09:17:11.2453490Z * [new tag] ciflow/rocm-mi200/165997 -> ciflow/rocm-mi200/165997 2025-12-04T09:17:11.2454714Z * [new tag] ciflow/rocm-mi200/168096 -> ciflow/rocm-mi200/168096 2025-12-04T09:17:11.2456419Z * [new tag] ciflow/rocm-mi200/168275 -> ciflow/rocm-mi200/168275 2025-12-04T09:17:11.2457399Z * [new tag] ciflow/rocm-mi200/169063 -> ciflow/rocm-mi200/169063 2025-12-04T09:17:11.2459025Z * [new tag] ciflow/rocm-mi200/169356 -> ciflow/rocm-mi200/169356 2025-12-04T09:17:11.2459981Z * [new tag] ciflow/rocm-mi200/169425 -> ciflow/rocm-mi200/169425 2025-12-04T09:17:11.2461788Z * [new tag] ciflow/rocm-mi300/165545 -> ciflow/rocm-mi300/165545 2025-12-04T09:17:11.2463068Z * [new tag] ciflow/rocm-mi300/167157 -> ciflow/rocm-mi300/167157 2025-12-04T09:17:11.2464266Z * [new tag] ciflow/rocm-mi300/168096 -> ciflow/rocm-mi300/168096 2025-12-04T09:17:11.2465456Z * [new tag] ciflow/rocm-mi300/169063 -> ciflow/rocm-mi300/169063 2025-12-04T09:17:11.2466711Z * [new tag] ciflow/rocm-mi300/169425 -> ciflow/rocm-mi300/169425 2025-12-04T09:17:11.2468658Z * [new tag] ciflow/rocm-mi355/167157 -> ciflow/rocm-mi355/167157 2025-12-04T09:17:11.2470352Z * [new tag] ciflow/rocm-mi355/168275 -> ciflow/rocm-mi355/168275 2025-12-04T09:17:11.2471299Z * [new tag] ciflow/rocm-mi355/169425 -> ciflow/rocm-mi355/169425 2025-12-04T09:17:11.2473095Z * [new tag] ciflow/rocm-navi31/168275 -> ciflow/rocm-navi31/168275 2025-12-04T09:17:11.2474045Z * [new tag] ciflow/rocm-navi31/169425 -> ciflow/rocm-navi31/169425 2025-12-04T09:17:11.2475800Z * [new tag] ciflow/rocm/115316 -> ciflow/rocm/115316 2025-12-04T09:17:11.2476764Z * [new tag] ciflow/rocm/148492 -> ciflow/rocm/148492 2025-12-04T09:17:11.2477959Z * [new tag] ciflow/rocm/160685 -> ciflow/rocm/160685 2025-12-04T09:17:11.2479367Z * [new tag] ciflow/rocm/161607 -> ciflow/rocm/161607 2025-12-04T09:17:11.2480449Z * [new tag] ciflow/rocm/162052 -> ciflow/rocm/162052 2025-12-04T09:17:11.2481653Z * [new tag] ciflow/rocm/165997 -> ciflow/rocm/165997 2025-12-04T09:17:11.2483136Z * [new tag] ciflow/rocm/166165 -> ciflow/rocm/166165 2025-12-04T09:17:11.2484140Z * [new tag] ciflow/rocm/166517 -> ciflow/rocm/166517 2025-12-04T09:17:11.2485520Z * [new tag] ciflow/rocm/167207 -> ciflow/rocm/167207 2025-12-04T09:17:11.2486630Z * [new tag] ciflow/rocm/167536 -> ciflow/rocm/167536 2025-12-04T09:17:11.2487847Z * [new tag] ciflow/rocm/167781 -> ciflow/rocm/167781 2025-12-04T09:17:11.2489822Z * [new tag] ciflow/rocm/167989 -> ciflow/rocm/167989 2025-12-04T09:17:11.2491538Z * [new tag] ciflow/rocm/168073 -> ciflow/rocm/168073 2025-12-04T09:17:11.2492727Z * [new tag] ciflow/rocm/168195 -> ciflow/rocm/168195 2025-12-04T09:17:11.2494326Z * [new tag] ciflow/rocm/168939 -> ciflow/rocm/168939 2025-12-04T09:17:11.2495346Z * [new tag] ciflow/rocm/168971 -> ciflow/rocm/168971 2025-12-04T09:17:11.2496941Z * [new tag] ciflow/rocm/169024 -> ciflow/rocm/169024 2025-12-04T09:17:11.2498007Z * [new tag] ciflow/rocm/169200 -> ciflow/rocm/169200 2025-12-04T09:17:11.2499420Z * [new tag] ciflow/rocm/169216 -> ciflow/rocm/169216 2025-12-04T09:17:11.2500570Z * [new tag] ciflow/rocm/169312 -> ciflow/rocm/169312 2025-12-04T09:17:11.2502128Z * [new tag] ciflow/rocm/169380 -> ciflow/rocm/169380 2025-12-04T09:17:11.2503291Z * [new tag] ciflow/rocm/169427 -> ciflow/rocm/169427 2025-12-04T09:17:11.2504903Z * [new tag] ciflow/rocm/169455 -> ciflow/rocm/169455 2025-12-04T09:17:11.2505946Z * [new tag] ciflow/rocm/169470 -> ciflow/rocm/169470 2025-12-04T09:17:11.2507592Z * [new tag] ciflow/rocm/169471 -> ciflow/rocm/169471 2025-12-04T09:17:11.2508636Z * [new tag] ciflow/rocm/169472 -> ciflow/rocm/169472 2025-12-04T09:17:11.2510190Z * [new tag] ciflow/rocm/169514 -> ciflow/rocm/169514 2025-12-04T09:17:11.2511937Z * [new tag] ciflow/slow/01c7106 -> ciflow/slow/01c7106 2025-12-04T09:17:11.2513023Z * [new tag] ciflow/slow/0577043 -> ciflow/slow/0577043 2025-12-04T09:17:11.2515106Z * [new tag] ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym -> ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym 2025-12-04T09:17:11.2516020Z * [new tag] ciflow/slow/0e81104 -> ciflow/slow/0e81104 2025-12-04T09:17:11.2517161Z * [new tag] ciflow/slow/167207 -> ciflow/slow/167207 2025-12-04T09:17:11.2518580Z * [new tag] ciflow/slow/168050 -> ciflow/slow/168050 2025-12-04T09:17:11.2520167Z * [new tag] ciflow/slow/1732077 -> ciflow/slow/1732077 2025-12-04T09:17:11.2521659Z * [new tag] ciflow/slow/187eb7c -> ciflow/slow/187eb7c 2025-12-04T09:17:11.2523435Z * [new tag] ciflow/slow/1faef89 -> ciflow/slow/1faef89 2025-12-04T09:17:11.2525622Z * [new tag] ciflow/slow/3920ec1 -> ciflow/slow/3920ec1 2025-12-04T09:17:11.2526795Z * [new tag] ciflow/slow/3b7c6b2 -> ciflow/slow/3b7c6b2 2025-12-04T09:17:11.2528657Z * [new tag] ciflow/slow/59a3759 -> ciflow/slow/59a3759 2025-12-04T09:17:11.2529795Z * [new tag] ciflow/slow/70ef0bb -> ciflow/slow/70ef0bb 2025-12-04T09:17:11.2531612Z * [new tag] ciflow/slow/788ff06 -> ciflow/slow/788ff06 2025-12-04T09:17:11.2533325Z * [new tag] ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym -> ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym 2025-12-04T09:17:11.2534295Z * [new tag] ciflow/slow/9d85864 -> ciflow/slow/9d85864 2025-12-04T09:17:11.2536186Z * [new tag] ciflow/slow/9ffad5b -> ciflow/slow/9ffad5b 2025-12-04T09:17:11.2537250Z * [new tag] ciflow/slow/a206e8b -> ciflow/slow/a206e8b 2025-12-04T09:17:11.2538997Z * [new tag] ciflow/slow/a837609 -> ciflow/slow/a837609 2025-12-04T09:17:11.2540284Z * [new tag] ciflow/slow/af841f3 -> ciflow/slow/af841f3 2025-12-04T09:17:11.2542478Z * [new tag] ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym -> ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym 2025-12-04T09:17:11.2543681Z * [new tag] ciflow/torchbench/168175 -> ciflow/torchbench/168175 2025-12-04T09:17:11.2545446Z * [new tag] ciflow/trunk/148492 -> ciflow/trunk/148492 2025-12-04T09:17:11.2546403Z * [new tag] ciflow/trunk/157149 -> ciflow/trunk/157149 2025-12-04T09:17:11.2547613Z * [new tag] ciflow/trunk/157994 -> ciflow/trunk/157994 2025-12-04T09:17:11.2548797Z * [new tag] ciflow/trunk/159718 -> ciflow/trunk/159718 2025-12-04T09:17:11.2550405Z * [new tag] ciflow/trunk/160685 -> ciflow/trunk/160685 2025-12-04T09:17:11.2551397Z * [new tag] ciflow/trunk/160729 -> ciflow/trunk/160729 2025-12-04T09:17:11.2552565Z * [new tag] ciflow/trunk/162275 -> ciflow/trunk/162275 2025-12-04T09:17:11.2554059Z * [new tag] ciflow/trunk/162795 -> ciflow/trunk/162795 2025-12-04T09:17:11.2555094Z * [new tag] ciflow/trunk/163245 -> ciflow/trunk/163245 2025-12-04T09:17:11.2556298Z * [new tag] ciflow/trunk/163942 -> ciflow/trunk/163942 2025-12-04T09:17:11.2557893Z * [new tag] ciflow/trunk/165274 -> ciflow/trunk/165274 2025-12-04T09:17:11.2559668Z * [new tag] ciflow/trunk/165483 -> ciflow/trunk/165483 2025-12-04T09:17:11.2561342Z * [new tag] ciflow/trunk/165728 -> ciflow/trunk/165728 2025-12-04T09:17:11.2562715Z * [new tag] ciflow/trunk/165922 -> ciflow/trunk/165922 2025-12-04T09:17:11.2563860Z * [new tag] ciflow/trunk/166075 -> ciflow/trunk/166075 2025-12-04T09:17:11.2565490Z * [new tag] ciflow/trunk/166165 -> ciflow/trunk/166165 2025-12-04T09:17:11.2566502Z * [new tag] ciflow/trunk/166829 -> ciflow/trunk/166829 2025-12-04T09:17:11.2568281Z * [new tag] ciflow/trunk/166843 -> ciflow/trunk/166843 2025-12-04T09:17:11.2569335Z * [new tag] ciflow/trunk/166876 -> ciflow/trunk/166876 2025-12-04T09:17:11.2570906Z * [new tag] ciflow/trunk/167207 -> ciflow/trunk/167207 2025-12-04T09:17:11.2572017Z * [new tag] ciflow/trunk/167536 -> ciflow/trunk/167536 2025-12-04T09:17:11.2573600Z * [new tag] ciflow/trunk/167552 -> ciflow/trunk/167552 2025-12-04T09:17:11.2574626Z * [new tag] ciflow/trunk/167555 -> ciflow/trunk/167555 2025-12-04T09:17:11.2576289Z * [new tag] ciflow/trunk/167599 -> ciflow/trunk/167599 2025-12-04T09:17:11.2578181Z * [new tag] ciflow/trunk/167659 -> ciflow/trunk/167659 2025-12-04T09:17:11.2579334Z * [new tag] ciflow/trunk/167672 -> ciflow/trunk/167672 2025-12-04T09:17:11.2580922Z * [new tag] ciflow/trunk/167742 -> ciflow/trunk/167742 2025-12-04T09:17:11.2582001Z * [new tag] ciflow/trunk/167781 -> ciflow/trunk/167781 2025-12-04T09:17:11.2583961Z * [new tag] ciflow/trunk/167837 -> ciflow/trunk/167837 2025-12-04T09:17:11.2585010Z * [new tag] ciflow/trunk/167887 -> ciflow/trunk/167887 2025-12-04T09:17:11.2586389Z * [new tag] ciflow/trunk/167978 -> ciflow/trunk/167978 2025-12-04T09:17:11.2587701Z * [new tag] ciflow/trunk/168050 -> ciflow/trunk/168050 2025-12-04T09:17:11.2588868Z * [new tag] ciflow/trunk/168051 -> ciflow/trunk/168051 2025-12-04T09:17:11.2590538Z * [new tag] ciflow/trunk/168096 -> ciflow/trunk/168096 2025-12-04T09:17:11.2591493Z * [new tag] ciflow/trunk/168127 -> ciflow/trunk/168127 2025-12-04T09:17:11.2593173Z * [new tag] ciflow/trunk/168157 -> ciflow/trunk/168157 2025-12-04T09:17:11.2594254Z * [new tag] ciflow/trunk/168175 -> ciflow/trunk/168175 2025-12-04T09:17:11.2595830Z * [new tag] ciflow/trunk/168209 -> ciflow/trunk/168209 2025-12-04T09:17:11.2597000Z * [new tag] ciflow/trunk/168213 -> ciflow/trunk/168213 2025-12-04T09:17:11.2598725Z * [new tag] ciflow/trunk/168226 -> ciflow/trunk/168226 2025-12-04T09:17:11.2599811Z * [new tag] ciflow/trunk/168262 -> ciflow/trunk/168262 2025-12-04T09:17:11.2601204Z * [new tag] ciflow/trunk/168275 -> ciflow/trunk/168275 2025-12-04T09:17:11.2602919Z * [new tag] ciflow/trunk/168328 -> ciflow/trunk/168328 2025-12-04T09:17:11.2603913Z * [new tag] ciflow/trunk/168368 -> ciflow/trunk/168368 2025-12-04T09:17:11.2605512Z * [new tag] ciflow/trunk/168917 -> ciflow/trunk/168917 2025-12-04T09:17:11.2606539Z * [new tag] ciflow/trunk/168933 -> ciflow/trunk/168933 2025-12-04T09:17:11.2608268Z * [new tag] ciflow/trunk/168941 -> ciflow/trunk/168941 2025-12-04T09:17:11.2609329Z * [new tag] ciflow/trunk/168955 -> ciflow/trunk/168955 2025-12-04T09:17:11.2610918Z * [new tag] ciflow/trunk/168980 -> ciflow/trunk/168980 2025-12-04T09:17:11.2612289Z * [new tag] ciflow/trunk/169004 -> ciflow/trunk/169004 2025-12-04T09:17:11.2613422Z * [new tag] ciflow/trunk/169006 -> ciflow/trunk/169006 2025-12-04T09:17:11.2615033Z * [new tag] ciflow/trunk/169023 -> ciflow/trunk/169023 2025-12-04T09:17:11.2616093Z * [new tag] ciflow/trunk/169025 -> ciflow/trunk/169025 2025-12-04T09:17:11.2617673Z * [new tag] ciflow/trunk/169048 -> ciflow/trunk/169048 2025-12-04T09:17:11.2618727Z * [new tag] ciflow/trunk/169066 -> ciflow/trunk/169066 2025-12-04T09:17:11.2620307Z * [new tag] ciflow/trunk/169091 -> ciflow/trunk/169091 2025-12-04T09:17:11.2621403Z * [new tag] ciflow/trunk/169102 -> ciflow/trunk/169102 2025-12-04T09:17:11.2623008Z * [new tag] ciflow/trunk/169103 -> ciflow/trunk/169103 2025-12-04T09:17:11.2624800Z * [new tag] ciflow/trunk/169125 -> ciflow/trunk/169125 2025-12-04T09:17:11.2625935Z * [new tag] ciflow/trunk/169139 -> ciflow/trunk/169139 2025-12-04T09:17:11.2627820Z * [new tag] ciflow/trunk/169148 -> ciflow/trunk/169148 2025-12-04T09:17:11.2628728Z * [new tag] ciflow/trunk/169151 -> ciflow/trunk/169151 2025-12-04T09:17:11.2630403Z * [new tag] ciflow/trunk/169156 -> ciflow/trunk/169156 2025-12-04T09:17:11.2631623Z * [new tag] ciflow/trunk/169176 -> ciflow/trunk/169176 2025-12-04T09:17:11.2633261Z * [new tag] ciflow/trunk/169204 -> ciflow/trunk/169204 2025-12-04T09:17:11.2634235Z * [new tag] ciflow/trunk/169207 -> ciflow/trunk/169207 2025-12-04T09:17:11.2635806Z * [new tag] ciflow/trunk/169211 -> ciflow/trunk/169211 2025-12-04T09:17:11.2637386Z * [new tag] ciflow/trunk/169231 -> ciflow/trunk/169231 2025-12-04T09:17:11.2638763Z * [new tag] ciflow/trunk/169260 -> ciflow/trunk/169260 2025-12-04T09:17:11.2640454Z * [new tag] ciflow/trunk/169271 -> ciflow/trunk/169271 2025-12-04T09:17:11.2641479Z * [new tag] ciflow/trunk/169280 -> ciflow/trunk/169280 2025-12-04T09:17:11.2643072Z * [new tag] ciflow/trunk/169281 -> ciflow/trunk/169281 2025-12-04T09:17:11.2644196Z * [new tag] ciflow/trunk/169286 -> ciflow/trunk/169286 2025-12-04T09:17:11.2645923Z * [new tag] ciflow/trunk/169293 -> ciflow/trunk/169293 2025-12-04T09:17:11.2646984Z * [new tag] ciflow/trunk/169296 -> ciflow/trunk/169296 2025-12-04T09:17:11.2648559Z * [new tag] ciflow/trunk/169304 -> ciflow/trunk/169304 2025-12-04T09:17:11.2649664Z * [new tag] ciflow/trunk/169305 -> ciflow/trunk/169305 2025-12-04T09:17:11.2651218Z * [new tag] ciflow/trunk/169312 -> ciflow/trunk/169312 2025-12-04T09:17:11.2652872Z * [new tag] ciflow/trunk/169328 -> ciflow/trunk/169328 2025-12-04T09:17:11.2653953Z * [new tag] ciflow/trunk/169343 -> ciflow/trunk/169343 2025-12-04T09:17:11.2655625Z * [new tag] ciflow/trunk/169355 -> ciflow/trunk/169355 2025-12-04T09:17:11.2656649Z * [new tag] ciflow/trunk/169370 -> ciflow/trunk/169370 2025-12-04T09:17:11.2658455Z * [new tag] ciflow/trunk/169379 -> ciflow/trunk/169379 2025-12-04T09:17:11.2659438Z * [new tag] ciflow/trunk/169380 -> ciflow/trunk/169380 2025-12-04T09:17:11.2661085Z * [new tag] ciflow/trunk/169385 -> ciflow/trunk/169385 2025-12-04T09:17:11.2662120Z * [new tag] ciflow/trunk/169387 -> ciflow/trunk/169387 2025-12-04T09:17:11.2664089Z * [new tag] ciflow/trunk/169410 -> ciflow/trunk/169410 2025-12-04T09:17:11.2665160Z * [new tag] ciflow/trunk/169412 -> ciflow/trunk/169412 2025-12-04T09:17:11.2667213Z * [new tag] ciflow/trunk/169418 -> ciflow/trunk/169418 2025-12-04T09:17:11.2668180Z * [new tag] ciflow/trunk/169423 -> ciflow/trunk/169423 2025-12-04T09:17:11.2669799Z * [new tag] ciflow/trunk/169427 -> ciflow/trunk/169427 2025-12-04T09:17:11.2670808Z * [new tag] ciflow/trunk/169430 -> ciflow/trunk/169430 2025-12-04T09:17:11.2672393Z * [new tag] ciflow/trunk/169437 -> ciflow/trunk/169437 2025-12-04T09:17:11.2673405Z * [new tag] ciflow/trunk/169442 -> ciflow/trunk/169442 2025-12-04T09:17:11.2675013Z * [new tag] ciflow/trunk/169452 -> ciflow/trunk/169452 2025-12-04T09:17:11.2676066Z * [new tag] ciflow/trunk/169454 -> ciflow/trunk/169454 2025-12-04T09:17:11.2677646Z * [new tag] ciflow/trunk/169459 -> ciflow/trunk/169459 2025-12-04T09:17:11.2678843Z * [new tag] ciflow/trunk/169474 -> ciflow/trunk/169474 2025-12-04T09:17:11.2680434Z * [new tag] ciflow/trunk/169475 -> ciflow/trunk/169475 2025-12-04T09:17:11.2681471Z * [new tag] ciflow/trunk/169476 -> ciflow/trunk/169476 2025-12-04T09:17:11.2683347Z * [new tag] ciflow/trunk/169487 -> ciflow/trunk/169487 2025-12-04T09:17:11.2684378Z * [new tag] ciflow/trunk/169497 -> ciflow/trunk/169497 2025-12-04T09:17:11.2685974Z * [new tag] ciflow/trunk/169503 -> ciflow/trunk/169503 2025-12-04T09:17:11.2687071Z * [new tag] ciflow/trunk/169505 -> ciflow/trunk/169505 2025-12-04T09:17:11.2688653Z * [new tag] ciflow/trunk/169507 -> ciflow/trunk/169507 2025-12-04T09:17:11.2689694Z * [new tag] ciflow/trunk/169514 -> ciflow/trunk/169514 2025-12-04T09:17:11.2691385Z * [new tag] ciflow/trunk/169517 -> ciflow/trunk/169517 2025-12-04T09:17:11.2692322Z * [new tag] ciflow/trunk/169519 -> ciflow/trunk/169519 2025-12-04T09:17:11.2693892Z * [new tag] ciflow/trunk/169528 -> ciflow/trunk/169528 2025-12-04T09:17:11.2694957Z * [new tag] ciflow/trunk/169541 -> ciflow/trunk/169541 2025-12-04T09:17:11.2696758Z * [new tag] ciflow/trunk/169555 -> ciflow/trunk/169555 2025-12-04T09:17:11.2698509Z * [new tag] ciflow/unstable/123 -> ciflow/unstable/123 2025-12-04T09:17:11.2700127Z * [new tag] ciflow/vllm/165270 -> ciflow/vllm/165270 2025-12-04T09:17:11.2701083Z * [new tag] ciflow/vllm/165274 -> ciflow/vllm/165274 2025-12-04T09:17:11.2702459Z * [new tag] ciflow/vllm/166494 -> ciflow/vllm/166494 2025-12-04T09:17:11.2703911Z * [new tag] ciflow/vllm/169219 -> ciflow/vllm/169219 2025-12-04T09:17:11.2704967Z * [new tag] ciflow/vllm/169220 -> ciflow/vllm/169220 2025-12-04T09:17:11.2706815Z * [new tag] ciflow/xpu/157994 -> ciflow/xpu/157994 2025-12-04T09:17:11.2707812Z * [new tag] ciflow/xpu/159718 -> ciflow/xpu/159718 2025-12-04T09:17:11.2708985Z * [new tag] ciflow/xpu/161940 -> ciflow/xpu/161940 2025-12-04T09:17:11.2710695Z * [new tag] ciflow/xpu/163251 -> ciflow/xpu/163251 2025-12-04T09:17:11.2711660Z * [new tag] ciflow/xpu/166829 -> ciflow/xpu/166829 2025-12-04T09:17:11.2713214Z * [new tag] ciflow/xpu/166843 -> ciflow/xpu/166843 2025-12-04T09:17:11.2714172Z * [new tag] ciflow/xpu/167972 -> ciflow/xpu/167972 2025-12-04T09:17:11.2715731Z * [new tag] ciflow/xpu/167981 -> ciflow/xpu/167981 2025-12-04T09:17:11.2716727Z * [new tag] ciflow/xpu/168213 -> ciflow/xpu/168213 2025-12-04T09:17:11.2717863Z * [new tag] ciflow/xpu/168262 -> ciflow/xpu/168262 2025-12-04T09:17:11.2719494Z * [new tag] ciflow/xpu/168328 -> ciflow/xpu/168328 2025-12-04T09:17:11.2721145Z * [new tag] ciflow/xpu/168950 -> ciflow/xpu/168950 2025-12-04T09:17:11.2723030Z * [new tag] ciflow/xpu/169039 -> ciflow/xpu/169039 2025-12-04T09:17:11.2725494Z * [new tag] ciflow/xpu/169200 -> ciflow/xpu/169200 2025-12-04T09:17:11.2728293Z * [new tag] ciflow/xpu/169203 -> ciflow/xpu/169203 2025-12-04T09:17:11.2729345Z * [new tag] ciflow/xpu/169230 -> ciflow/xpu/169230 2025-12-04T09:17:11.2730934Z * [new tag] ciflow/xpu/169231 -> ciflow/xpu/169231 2025-12-04T09:17:11.2732317Z * [new tag] ciflow/xpu/169241 -> ciflow/xpu/169241 2025-12-04T09:17:11.2733509Z * [new tag] ciflow/xpu/169280 -> ciflow/xpu/169280 2025-12-04T09:17:11.2735173Z * [new tag] ciflow/xpu/169296 -> ciflow/xpu/169296 2025-12-04T09:17:11.2736326Z * [new tag] ciflow/xpu/169353 -> ciflow/xpu/169353 2025-12-04T09:17:11.2737958Z * [new tag] ciflow/xpu/169410 -> ciflow/xpu/169410 2025-12-04T09:17:11.2739113Z * [new tag] ciflow/xpu/169442 -> ciflow/xpu/169442 2025-12-04T09:17:11.2740832Z * [new tag] ciflow/xpu/169555 -> ciflow/xpu/169555 2025-12-04T09:17:11.2741929Z * [new tag] cslpull75 -> cslpull75 2025-12-04T09:17:11.2744070Z * [new tag] cslpull76 -> cslpull76 2025-12-04T09:17:11.2745108Z * [new tag] cslpull77 -> cslpull77 2025-12-04T09:17:11.2746804Z * [new tag] cslpull78 -> cslpull78 2025-12-04T09:17:11.2748433Z * [new tag] cslpull79 -> cslpull79 2025-12-04T09:17:11.2749590Z * [new tag] cslpull80 -> cslpull80 2025-12-04T09:17:11.2751273Z * [new tag] cslpull81 -> cslpull81 2025-12-04T09:17:11.2752420Z * [new tag] cslpull82 -> cslpull82 2025-12-04T09:17:11.2754024Z * [new tag] cslpull83 -> cslpull83 2025-12-04T09:17:11.2755148Z * [new tag] cslpull84 -> cslpull84 2025-12-04T09:17:11.2756759Z * [new tag] cslpull85 -> cslpull85 2025-12-04T09:17:11.2757939Z * [new tag] cslpull86 -> cslpull86 2025-12-04T09:17:11.2759590Z * [new tag] cslpull87 -> cslpull87 2025-12-04T09:17:11.2760712Z * [new tag] cslpull88 -> cslpull88 2025-12-04T09:17:11.2762378Z * [new tag] cslpull89 -> cslpull89 2025-12-04T09:17:11.2763298Z * [new tag] cslpull90 -> cslpull90 2025-12-04T09:17:11.2765312Z * [new tag] cslpull91 -> cslpull91 2025-12-04T09:17:11.2766406Z * [new tag] cslpull92 -> cslpull92 2025-12-04T09:17:11.2768116Z * [new tag] flight_5 -> flight_5 2025-12-04T09:17:11.2769704Z * [new tag] flight_5.1 -> flight_5.1 2025-12-04T09:17:11.2770980Z * [new tag] flight_5.2 -> flight_5.2 2025-12-04T09:17:11.2772661Z * [new tag] flight_5.3 -> flight_5.3 2025-12-04T09:17:11.2773812Z * [new tag] forpull1 -> forpull1 2025-12-04T09:17:11.2775781Z * [new tag] malfet/tag-2ef5611 -> malfet/tag-2ef5611 2025-12-04T09:17:11.2776907Z * [new tag] malfet/tag-317b1a0 -> malfet/tag-317b1a0 2025-12-04T09:17:11.2778563Z * [new tag] malfet/tag-ec6f767 -> malfet/tag-ec6f767 2025-12-04T09:17:11.2779786Z * [new tag] nightly-binary -> nightly-binary 2025-12-04T09:17:11.2781474Z * [new tag] sqzhang_flight4_plus -> sqzhang_flight4_plus 2025-12-04T09:17:11.2783249Z * [new tag] sqzhang_flight_3 -> sqzhang_flight_3 2025-12-04T09:17:11.2785105Z * [new tag] trunk/02d8bd6974cf84b721680d773dbdb1b6f40ce272 -> trunk/02d8bd6974cf84b721680d773dbdb1b6f40ce272 2025-12-04T09:17:11.2786181Z * [new tag] trunk/066997fb38ade71e00d78e9d572e380b5f02bd3e -> trunk/066997fb38ade71e00d78e9d572e380b5f02bd3e 2025-12-04T09:17:11.2788199Z * [new tag] trunk/076e7b19fa1d481ad778d06d2b49ba57d3ce8c88 -> trunk/076e7b19fa1d481ad778d06d2b49ba57d3ce8c88 2025-12-04T09:17:11.2789561Z * [new tag] trunk/07dcc0b83db3211653a38565a24e15acdba75654 -> trunk/07dcc0b83db3211653a38565a24e15acdba75654 2025-12-04T09:17:11.2791068Z * [new tag] trunk/082e96b68dfcd16cab7cfafc4d3d055767dab3eb -> trunk/082e96b68dfcd16cab7cfafc4d3d055767dab3eb 2025-12-04T09:17:11.2792541Z * [new tag] trunk/088048f2fea28ff7d450f65c72419ca45780d30b -> trunk/088048f2fea28ff7d450f65c72419ca45780d30b 2025-12-04T09:17:11.2793952Z * [new tag] trunk/09076941a95c76f4d9ad189d064dfd8baa39e672 -> trunk/09076941a95c76f4d9ad189d064dfd8baa39e672 2025-12-04T09:17:11.2795372Z * [new tag] trunk/0b80a4c62b94402844bf221791c096b0035c6d75 -> trunk/0b80a4c62b94402844bf221791c096b0035c6d75 2025-12-04T09:17:11.2797479Z * [new tag] trunk/0bbbdf1750567a980634ad907a325357ba8ba8f2 -> trunk/0bbbdf1750567a980634ad907a325357ba8ba8f2 2025-12-04T09:17:11.2798737Z * [new tag] trunk/0c281dd78773b2bc17c58ead0e4cd4ac46e775c5 -> trunk/0c281dd78773b2bc17c58ead0e4cd4ac46e775c5 2025-12-04T09:17:11.2799970Z * [new tag] trunk/135f3753c418a6879b1954904184937b67e61688 -> trunk/135f3753c418a6879b1954904184937b67e61688 2025-12-04T09:17:11.2801496Z * [new tag] trunk/15da21026cb13cd20257dc9e96830db108743c10 -> trunk/15da21026cb13cd20257dc9e96830db108743c10 2025-12-04T09:17:11.2803014Z * [new tag] trunk/166efdad2ac827f30fb02504c6017520257f88ec -> trunk/166efdad2ac827f30fb02504c6017520257f88ec 2025-12-04T09:17:11.2804424Z * [new tag] trunk/174272c15fae553d8488140af931f7d8050a313f -> trunk/174272c15fae553d8488140af931f7d8050a313f 2025-12-04T09:17:11.2806207Z * [new tag] trunk/18f3ca08f13b8de61307f5e8cd7d4cccb67e9d11 -> trunk/18f3ca08f13b8de61307f5e8cd7d4cccb67e9d11 2025-12-04T09:17:11.2807531Z * [new tag] trunk/1902eddfe655a15ebcf2c72bd81ade110fdeef63 -> trunk/1902eddfe655a15ebcf2c72bd81ade110fdeef63 2025-12-04T09:17:11.2809066Z * [new tag] trunk/195f92e98d3d66738577f11f22c4b5c8a1c76dd5 -> trunk/195f92e98d3d66738577f11f22c4b5c8a1c76dd5 2025-12-04T09:17:11.2810498Z * [new tag] trunk/1aa13e17de39e3c768ea7aebaad166ce72a06676 -> trunk/1aa13e17de39e3c768ea7aebaad166ce72a06676 2025-12-04T09:17:11.2811939Z * [new tag] trunk/1afe2832f58e24e54a5bfda5a5afa9b96fdea40e -> trunk/1afe2832f58e24e54a5bfda5a5afa9b96fdea40e 2025-12-04T09:17:11.2813344Z * [new tag] trunk/1c87554d74140eaee964ca8b1832cede67f5f520 -> trunk/1c87554d74140eaee964ca8b1832cede67f5f520 2025-12-04T09:17:11.2814821Z * [new tag] trunk/1ccb743b7b5be955f49736c162c4f5004b8a0dd8 -> trunk/1ccb743b7b5be955f49736c162c4f5004b8a0dd8 2025-12-04T09:17:11.2816783Z * [new tag] trunk/1cee47d6ce0a02227185b566593f002dd639ca0c -> trunk/1cee47d6ce0a02227185b566593f002dd639ca0c 2025-12-04T09:17:11.2817827Z * [new tag] trunk/1d21b4df2babe322e5d085ceb6de884eb260a62d -> trunk/1d21b4df2babe322e5d085ceb6de884eb260a62d 2025-12-04T09:17:11.2819237Z * [new tag] trunk/1e34fb2550e4aa650314f7a6d9f6daf4da7478a8 -> trunk/1e34fb2550e4aa650314f7a6d9f6daf4da7478a8 2025-12-04T09:17:11.2820787Z * [new tag] trunk/1e526fb5b1d93bfc70691c5c3955fdffc1b7b7de -> trunk/1e526fb5b1d93bfc70691c5c3955fdffc1b7b7de 2025-12-04T09:17:11.2822213Z * [new tag] trunk/1ee32a8b1f554a312d79bad01ded24f38cd95543 -> trunk/1ee32a8b1f554a312d79bad01ded24f38cd95543 2025-12-04T09:17:11.2823755Z * [new tag] trunk/201e2c4117eb9744594dad6a5c18213d7b4705d7 -> trunk/201e2c4117eb9744594dad6a5c18213d7b4705d7 2025-12-04T09:17:11.2825395Z * [new tag] trunk/2353a0f60eb4b4cb6675907a7fa9fbedc1c02e7f -> trunk/2353a0f60eb4b4cb6675907a7fa9fbedc1c02e7f 2025-12-04T09:17:11.2827613Z * [new tag] trunk/285779b1621cf9f073a062b0889a642d200308d9 -> trunk/285779b1621cf9f073a062b0889a642d200308d9 2025-12-04T09:17:11.2828449Z * [new tag] trunk/2887faaec6295d081580d09fce161201826c6d87 -> trunk/2887faaec6295d081580d09fce161201826c6d87 2025-12-04T09:17:11.2829906Z * [new tag] trunk/296e67c92635443c67b11c0ae1bd045f03ebb7bc -> trunk/296e67c92635443c67b11c0ae1bd045f03ebb7bc 2025-12-04T09:17:11.2831331Z * [new tag] trunk/29856679769b3dede478767e2fe6cfb51197cb25 -> trunk/29856679769b3dede478767e2fe6cfb51197cb25 2025-12-04T09:17:11.2832839Z * [new tag] trunk/29e5455a4740c326ab187c7aa7b5ef98034ea563 -> trunk/29e5455a4740c326ab187c7aa7b5ef98034ea563 2025-12-04T09:17:11.2834299Z * [new tag] trunk/2ac3ef882afb23136adc188975f0a8802fc68adf -> trunk/2ac3ef882afb23136adc188975f0a8802fc68adf 2025-12-04T09:17:11.2835591Z * [new tag] trunk/2bec68e73b64715354af076ad309335f943e36cd -> trunk/2bec68e73b64715354af076ad309335f943e36cd 2025-12-04T09:17:11.2837017Z * [new tag] trunk/2c87367e6f88662cd5cedbd1537748b7948c38e1 -> trunk/2c87367e6f88662cd5cedbd1537748b7948c38e1 2025-12-04T09:17:11.2838653Z * [new tag] trunk/2d1f78fe3ec13820f136a2e0336da12a25f41708 -> trunk/2d1f78fe3ec13820f136a2e0336da12a25f41708 2025-12-04T09:17:11.2840398Z * [new tag] trunk/2df6058f116a65722a0e03073402feb242572d35 -> trunk/2df6058f116a65722a0e03073402feb242572d35 2025-12-04T09:17:11.2841915Z * [new tag] trunk/2e0c2e170fe658c440775c8e5c44228aafcc47ec -> trunk/2e0c2e170fe658c440775c8e5c44228aafcc47ec 2025-12-04T09:17:11.2843551Z * [new tag] trunk/2f9b7dad7b5419b063bd0f2e204de192720ebb94 -> trunk/2f9b7dad7b5419b063bd0f2e204de192720ebb94 2025-12-04T09:17:11.2844655Z * [new tag] trunk/305168768a95d69c444df5cd334bb774edfe06f1 -> trunk/305168768a95d69c444df5cd334bb774edfe06f1 2025-12-04T09:17:11.2846352Z * [new tag] trunk/31fc12773026e8e00f054dd79ad9b2491e693b48 -> trunk/31fc12773026e8e00f054dd79ad9b2491e693b48 2025-12-04T09:17:11.2847881Z * [new tag] trunk/320de0c6b0a3e7c6d2693ea5c28d5d0156ba7991 -> trunk/320de0c6b0a3e7c6d2693ea5c28d5d0156ba7991 2025-12-04T09:17:11.2849356Z * [new tag] trunk/3418bd29475dff06695045fcdf93e7d0dac67da8 -> trunk/3418bd29475dff06695045fcdf93e7d0dac67da8 2025-12-04T09:17:11.2850774Z * [new tag] trunk/34a98608afa0cb5b48f0d6d30432fdd0a2614ddf -> trunk/34a98608afa0cb5b48f0d6d30432fdd0a2614ddf 2025-12-04T09:17:11.2852255Z * [new tag] trunk/35b7a9a26c5923d98aebaa41a031dae21788a9ee -> trunk/35b7a9a26c5923d98aebaa41a031dae21788a9ee 2025-12-04T09:17:11.2853551Z * [new tag] trunk/39d07dbf03a911bdd45d1af78d8638dc92074938 -> trunk/39d07dbf03a911bdd45d1af78d8638dc92074938 2025-12-04T09:17:11.2855038Z * [new tag] trunk/3cd98b4205ada151042cc7ff097a82d4a4b18725 -> trunk/3cd98b4205ada151042cc7ff097a82d4a4b18725 2025-12-04T09:17:11.2856337Z * [new tag] trunk/3d35fd20a78ff4d016fa80f4e5fad37191d7bcae -> trunk/3d35fd20a78ff4d016fa80f4e5fad37191d7bcae 2025-12-04T09:17:11.2857931Z * [new tag] trunk/409a5fee945c46a3edaf5df162812f201bfd7b2f -> trunk/409a5fee945c46a3edaf5df162812f201bfd7b2f 2025-12-04T09:17:11.2859396Z * [new tag] trunk/42e9005cda22da3f1c559c3649218cebd671027c -> trunk/42e9005cda22da3f1c559c3649218cebd671027c 2025-12-04T09:17:11.2860902Z * [new tag] trunk/43b94713bbf340d3c124fde02d0f73add4021247 -> trunk/43b94713bbf340d3c124fde02d0f73add4021247 2025-12-04T09:17:11.2862604Z * [new tag] trunk/44ac69388a4a5eb463dbd2a13f00d1e3b924566c -> trunk/44ac69388a4a5eb463dbd2a13f00d1e3b924566c 2025-12-04T09:17:11.2863758Z * [new tag] trunk/45d14e2497292be06ad36eaa1aaaf7c630a2586a -> trunk/45d14e2497292be06ad36eaa1aaaf7c630a2586a 2025-12-04T09:17:11.2865463Z * [new tag] trunk/45d310ad84854dff730c0b12e577d7998d978686 -> trunk/45d310ad84854dff730c0b12e577d7998d978686 2025-12-04T09:17:11.2867127Z * [new tag] trunk/47b28ddf7bd74b50fa93b307a7d3b183a6d77f54 -> trunk/47b28ddf7bd74b50fa93b307a7d3b183a6d77f54 2025-12-04T09:17:11.2868186Z * [new tag] trunk/481e5ab336275bd3acd5fa8a611b05b4469012af -> trunk/481e5ab336275bd3acd5fa8a611b05b4469012af 2025-12-04T09:17:11.2869919Z * [new tag] trunk/491731647f6b8a9345dcfb3bc9416aea254a7d96 -> trunk/491731647f6b8a9345dcfb3bc9416aea254a7d96 2025-12-04T09:17:11.2871958Z * [new tag] trunk/49a04d26088acc17d948ddd66920f3e16371e873 -> trunk/49a04d26088acc17d948ddd66920f3e16371e873 2025-12-04T09:17:11.2873368Z * [new tag] trunk/4bebc827c47d2f1f0fa1a417a5201a97aef3d985 -> trunk/4bebc827c47d2f1f0fa1a417a5201a97aef3d985 2025-12-04T09:17:11.2874424Z * [new tag] trunk/4c246677784c6a14bc2dbb9ff8773ef0a3a3222f -> trunk/4c246677784c6a14bc2dbb9ff8773ef0a3a3222f 2025-12-04T09:17:11.2876349Z * [new tag] trunk/4cfb47ff548b6d996641058cf04a70e311a4c3aa -> trunk/4cfb47ff548b6d996641058cf04a70e311a4c3aa 2025-12-04T09:17:11.2878032Z * [new tag] trunk/4e0061c1aa52f606dda8cfab0bd7591e588faf2c -> trunk/4e0061c1aa52f606dda8cfab0bd7591e588faf2c 2025-12-04T09:17:11.2879908Z * [new tag] trunk/4fefb8e7e942386ffac764a41b232241f82bea3a -> trunk/4fefb8e7e942386ffac764a41b232241f82bea3a 2025-12-04T09:17:11.2881360Z * [new tag] trunk/503b2640023521f5a35cd9a52fc8033d73a95d0d -> trunk/503b2640023521f5a35cd9a52fc8033d73a95d0d 2025-12-04T09:17:11.2882610Z * [new tag] trunk/518c2b1b3dab9a2ef2849e04b3bc2f20c1c41db9 -> trunk/518c2b1b3dab9a2ef2849e04b3bc2f20c1c41db9 2025-12-04T09:17:11.2884365Z * [new tag] trunk/5191b2fa68ba19960912bfd7fd721c79d76bb1f3 -> trunk/5191b2fa68ba19960912bfd7fd721c79d76bb1f3 2025-12-04T09:17:11.2885917Z * [new tag] trunk/52ac0f0dc4acacd219f1317fbc28ec631c01e07a -> trunk/52ac0f0dc4acacd219f1317fbc28ec631c01e07a 2025-12-04T09:17:11.2887553Z * [new tag] trunk/539ba711b029de9f191070f4f0d12f18f5b7f292 -> trunk/539ba711b029de9f191070f4f0d12f18f5b7f292 2025-12-04T09:17:11.2888602Z * [new tag] trunk/556375b55deebebbc56cb7aef81f4d52f031ba28 -> trunk/556375b55deebebbc56cb7aef81f4d52f031ba28 2025-12-04T09:17:11.2890404Z * [new tag] trunk/55c4ab554845481d0a69a3811937575fe8bb1a66 -> trunk/55c4ab554845481d0a69a3811937575fe8bb1a66 2025-12-04T09:17:11.2891901Z * [new tag] trunk/5634469fda9e5d98869c82c7d03bb08914245f96 -> trunk/5634469fda9e5d98869c82c7d03bb08914245f96 2025-12-04T09:17:11.2892957Z * [new tag] trunk/5778f6ff894686a975a9a23645178ae4c87ad5dc -> trunk/5778f6ff894686a975a9a23645178ae4c87ad5dc 2025-12-04T09:17:11.2894764Z * [new tag] trunk/587d63a3e07de5dc91065f9ef70bcacda9989068 -> trunk/587d63a3e07de5dc91065f9ef70bcacda9989068 2025-12-04T09:17:11.2896112Z * [new tag] trunk/597930f6b568852356ca9795dac76f9e4653adbd -> trunk/597930f6b568852356ca9795dac76f9e4653adbd 2025-12-04T09:17:11.2897216Z * [new tag] trunk/597df3a4e2a67b9fdbe1a89b2f4d74f822274db6 -> trunk/597df3a4e2a67b9fdbe1a89b2f4d74f822274db6 2025-12-04T09:17:11.2899099Z * [new tag] trunk/59abd50e931f4efb21b053f7a2911f5d8a49d883 -> trunk/59abd50e931f4efb21b053f7a2911f5d8a49d883 2025-12-04T09:17:11.2900628Z * [new tag] trunk/5a607febc04c3a2b5824c75f3f60307867439a2c -> trunk/5a607febc04c3a2b5824c75f3f60307867439a2c 2025-12-04T09:17:11.2902325Z * [new tag] trunk/5bf1cdf4755c54ef462b44cb8041b0a57311556b -> trunk/5bf1cdf4755c54ef462b44cb8041b0a57311556b 2025-12-04T09:17:11.2903439Z * [new tag] trunk/5f0030ba63d334d7e8c93a09e41403b89e4c573c -> trunk/5f0030ba63d334d7e8c93a09e41403b89e4c573c 2025-12-04T09:17:11.2905186Z * [new tag] trunk/5f21d27e71268464d362a96c9ac09ea475f7f202 -> trunk/5f21d27e71268464d362a96c9ac09ea475f7f202 2025-12-04T09:17:11.2906719Z * [new tag] trunk/5fafc13038c9988d9ac21fa793fbd5890604b447 -> trunk/5fafc13038c9988d9ac21fa793fbd5890604b447 2025-12-04T09:17:11.2927542Z * [new tag] trunk/61be54a31dc09b59d99b62176fb935aee0b924ef -> trunk/61be54a31dc09b59d99b62176fb935aee0b924ef 2025-12-04T09:17:11.2928128Z * [new tag] trunk/62d3ccd71484ed6a760d909b41487101bbc65719 -> trunk/62d3ccd71484ed6a760d909b41487101bbc65719 2025-12-04T09:17:11.2928688Z * [new tag] trunk/641cdb68ae27668eb441d0e49c87a0602c120c2b -> trunk/641cdb68ae27668eb441d0e49c87a0602c120c2b 2025-12-04T09:17:11.2929068Z * [new tag] trunk/65c4620d6bb0c6029f69762c22b91dda2294da9a -> trunk/65c4620d6bb0c6029f69762c22b91dda2294da9a 2025-12-04T09:17:11.2929415Z * [new tag] trunk/66004b993744b4106bf8afaba71f3c228a804206 -> trunk/66004b993744b4106bf8afaba71f3c228a804206 2025-12-04T09:17:11.2929766Z * [new tag] trunk/6658a04c7ca67acb64512341342e7b3ee13ee386 -> trunk/6658a04c7ca67acb64512341342e7b3ee13ee386 2025-12-04T09:17:11.2930116Z * [new tag] trunk/6864e309092a71f8ab0ca6a4dc7f8a4073fd31c4 -> trunk/6864e309092a71f8ab0ca6a4dc7f8a4073fd31c4 2025-12-04T09:17:11.2930640Z * [new tag] trunk/6c261c6cb07892c90ca19ed51c9705b1659a3f7d -> trunk/6c261c6cb07892c90ca19ed51c9705b1659a3f7d 2025-12-04T09:17:11.2930991Z * [new tag] trunk/6c8b6a043f1628188b6396b3a2a6e000ca68362b -> trunk/6c8b6a043f1628188b6396b3a2a6e000ca68362b 2025-12-04T09:17:11.2931464Z * [new tag] trunk/6ceb4a32f92ae67ce5d7d97931d17401ebf5ffa5 -> trunk/6ceb4a32f92ae67ce5d7d97931d17401ebf5ffa5 2025-12-04T09:17:11.2931815Z * [new tag] trunk/6e404e9b7d6f5fb0de86aa73888c3038248c17f8 -> trunk/6e404e9b7d6f5fb0de86aa73888c3038248c17f8 2025-12-04T09:17:11.2932181Z * [new tag] trunk/6ec30b490aee1db6bcdc7340abddef25784f08ec -> trunk/6ec30b490aee1db6bcdc7340abddef25784f08ec 2025-12-04T09:17:11.2932535Z * [new tag] trunk/6f2783a6c08e1db34275ff25176ffe9aebc30a71 -> trunk/6f2783a6c08e1db34275ff25176ffe9aebc30a71 2025-12-04T09:17:11.2932904Z * [new tag] trunk/6f53fefeb90ad3281119b5cfc4aa9ffd8a066e3d -> trunk/6f53fefeb90ad3281119b5cfc4aa9ffd8a066e3d 2025-12-04T09:17:11.2933280Z * [new tag] trunk/6f7dcf51e46d0c880db1a2f5c70de57adb576f4a -> trunk/6f7dcf51e46d0c880db1a2f5c70de57adb576f4a 2025-12-04T09:17:11.2933637Z * [new tag] trunk/6ff831180d2fa436c7f1c1af3adac641fce9d60e -> trunk/6ff831180d2fa436c7f1c1af3adac641fce9d60e 2025-12-04T09:17:11.2933993Z * [new tag] trunk/70076464a63ab218a7ceefb0e76ccd7131deb8f8 -> trunk/70076464a63ab218a7ceefb0e76ccd7131deb8f8 2025-12-04T09:17:11.2934347Z * [new tag] trunk/70d797a5fc109b20a517646fcaa819477cd0d485 -> trunk/70d797a5fc109b20a517646fcaa819477cd0d485 2025-12-04T09:17:11.2935151Z * [new tag] trunk/7348cb355ff0a6f79cd4871215aea72185748734 -> trunk/7348cb355ff0a6f79cd4871215aea72185748734 2025-12-04T09:17:11.2936835Z * [new tag] trunk/74fe26a1ebe32931783569f2e762e3c2c974901f -> trunk/74fe26a1ebe32931783569f2e762e3c2c974901f 2025-12-04T09:17:11.2938560Z * [new tag] trunk/76aeb8c7e0f795b3fddca134cbea9a69da3ee696 -> trunk/76aeb8c7e0f795b3fddca134cbea9a69da3ee696 2025-12-04T09:17:11.2939379Z * [new tag] trunk/7716da9fb23f27a65b41f9f016a2afadf281c18f -> trunk/7716da9fb23f27a65b41f9f016a2afadf281c18f 2025-12-04T09:17:11.2941171Z * [new tag] trunk/7741edd4ed665f3988052e260863efb508d61a03 -> trunk/7741edd4ed665f3988052e260863efb508d61a03 2025-12-04T09:17:11.2942773Z * [new tag] trunk/78adb3b3df41b45d2368b67226d2f864b78939a6 -> trunk/78adb3b3df41b45d2368b67226d2f864b78939a6 2025-12-04T09:17:11.2944416Z * [new tag] trunk/79d7b178225e5ed24d4e1db74e5abbff848f5fb7 -> trunk/79d7b178225e5ed24d4e1db74e5abbff848f5fb7 2025-12-04T09:17:11.2945427Z * [new tag] trunk/7a1e316115fc6996b3f2336822ba5d5f6179f0c3 -> trunk/7a1e316115fc6996b3f2336822ba5d5f6179f0c3 2025-12-04T09:17:11.2947061Z * [new tag] trunk/7a41b66367c38d0af3e8a90f7be48d6b281e7bca -> trunk/7a41b66367c38d0af3e8a90f7be48d6b281e7bca 2025-12-04T09:17:11.2948596Z * [new tag] trunk/7b7af390ea8541c611d1ce2018a6934188fc197b -> trunk/7b7af390ea8541c611d1ce2018a6934188fc197b 2025-12-04T09:17:11.2950276Z * [new tag] trunk/7ba4680f3755a560af81aa0f688791e367aa3609 -> trunk/7ba4680f3755a560af81aa0f688791e367aa3609 2025-12-04T09:17:11.2951679Z * [new tag] trunk/7bc2a66ded06a0b2549aa51d807edc5dc3e73d1b -> trunk/7bc2a66ded06a0b2549aa51d807edc5dc3e73d1b 2025-12-04T09:17:11.2952658Z * [new tag] trunk/7c648509a7470ace9fb2bae960dd4790f7e943e9 -> trunk/7c648509a7470ace9fb2bae960dd4790f7e943e9 2025-12-04T09:17:11.2954513Z * [new tag] trunk/7cbc2d034cecd21ab5c9707d0a9c525c17143fb8 -> trunk/7cbc2d034cecd21ab5c9707d0a9c525c17143fb8 2025-12-04T09:17:11.2955585Z * [new tag] trunk/7d1bbaf4ba301ea3fba6f3c7bc02d58f6417aaed -> trunk/7d1bbaf4ba301ea3fba6f3c7bc02d58f6417aaed 2025-12-04T09:17:11.2957390Z * [new tag] trunk/7d2a33e4ebf60b217a3cd77feae19231eb996fc8 -> trunk/7d2a33e4ebf60b217a3cd77feae19231eb996fc8 2025-12-04T09:17:11.2959908Z * [new tag] trunk/7eb625920054b1126a7d2d99818aaa188c6ba95e -> trunk/7eb625920054b1126a7d2d99818aaa188c6ba95e 2025-12-04T09:17:11.2960294Z * [new tag] trunk/7f55ba19c456a3d6cc443dd9edb6bb7cca677ead -> trunk/7f55ba19c456a3d6cc443dd9edb6bb7cca677ead 2025-12-04T09:17:11.2961441Z * [new tag] trunk/81af382128efa094d8702e18f2c133760904c718 -> trunk/81af382128efa094d8702e18f2c133760904c718 2025-12-04T09:17:11.2963237Z * [new tag] trunk/84149583d483e9c973c9a0feda70e4f3964947b0 -> trunk/84149583d483e9c973c9a0feda70e4f3964947b0 2025-12-04T09:17:11.2964865Z * [new tag] trunk/85a315917efe82c24306be805c584ec044951c75 -> trunk/85a315917efe82c24306be805c584ec044951c75 2025-12-04T09:17:11.2966380Z * [new tag] trunk/87329491c82a5f8c1cc4ec11d8f55a5de2551ece -> trunk/87329491c82a5f8c1cc4ec11d8f55a5de2551ece 2025-12-04T09:17:11.2968204Z * [new tag] trunk/892640e25aeefa8007c5af837214b4502b6b62a6 -> trunk/892640e25aeefa8007c5af837214b4502b6b62a6 2025-12-04T09:17:11.2969946Z * [new tag] trunk/89e3bbcb5b5321dc8b9520b4d5a8ee60cea1d0b4 -> trunk/89e3bbcb5b5321dc8b9520b4d5a8ee60cea1d0b4 2025-12-04T09:17:11.2971012Z * [new tag] trunk/8c73bbbb02159223c0c97d268a0a74cb78158a1c -> trunk/8c73bbbb02159223c0c97d268a0a74cb78158a1c 2025-12-04T09:17:11.2972874Z * [new tag] trunk/8d56e98c8db988a22cb2dfaeefb30bc7d2a3cc43 -> trunk/8d56e98c8db988a22cb2dfaeefb30bc7d2a3cc43 2025-12-04T09:17:11.2974535Z * [new tag] trunk/8d9dd9603e5ee26c01007f0cd4f018e584840922 -> trunk/8d9dd9603e5ee26c01007f0cd4f018e584840922 2025-12-04T09:17:11.2975611Z * [new tag] trunk/8ef0c0b02b062d75e7c9be2594914a3e784d23ca -> trunk/8ef0c0b02b062d75e7c9be2594914a3e784d23ca 2025-12-04T09:17:11.2977419Z * [new tag] trunk/90b27e7e8352cde97d32ddad24740ef819633f38 -> trunk/90b27e7e8352cde97d32ddad24740ef819633f38 2025-12-04T09:17:11.2978502Z * [new tag] trunk/90f0139e64b2951815d524b6a373bed20c4fbf90 -> trunk/90f0139e64b2951815d524b6a373bed20c4fbf90 2025-12-04T09:17:11.2980115Z * [new tag] trunk/93d0d6838c56af59b0dba794e6aa08f0c1c7799c -> trunk/93d0d6838c56af59b0dba794e6aa08f0c1c7799c 2025-12-04T09:17:11.2981686Z * [new tag] trunk/94ca8d5f1e81fea3ae488650a0fb6795049a9f87 -> trunk/94ca8d5f1e81fea3ae488650a0fb6795049a9f87 2025-12-04T09:17:11.2983111Z * [new tag] trunk/9844fbeadd5cebdf1281d6fbf79164139c352693 -> trunk/9844fbeadd5cebdf1281d6fbf79164139c352693 2025-12-04T09:17:11.2984952Z * [new tag] trunk/99024dec888ec1e50b546822a32b6fb2f35e5eaa -> trunk/99024dec888ec1e50b546822a32b6fb2f35e5eaa 2025-12-04T09:17:11.2986004Z * [new tag] trunk/9a296e640fc88aa44d275b48cd9cc30c573b169d -> trunk/9a296e640fc88aa44d275b48cd9cc30c573b169d 2025-12-04T09:17:11.2987815Z * [new tag] trunk/9b3e34d8589b29f7b4e7fab6f78711b7ca6e4639 -> trunk/9b3e34d8589b29f7b4e7fab6f78711b7ca6e4639 2025-12-04T09:17:11.2989248Z * [new tag] trunk/9cd055e547e9b67a5f9827f8999c38d7eda1bcb8 -> trunk/9cd055e547e9b67a5f9827f8999c38d7eda1bcb8 2025-12-04T09:17:11.2990746Z * [new tag] trunk/9f0df5686cb4ada94f94620acba2e3c3f363b11d -> trunk/9f0df5686cb4ada94f94620acba2e3c3f363b11d 2025-12-04T09:17:11.2992265Z * [new tag] trunk/9f7fceb887d0cfa0326a59b887821c63ff11340a -> trunk/9f7fceb887d0cfa0326a59b887821c63ff11340a 2025-12-04T09:17:11.2993764Z * [new tag] trunk/9f8ef8855d3078d70f7b782540ff2aaf158d6742 -> trunk/9f8ef8855d3078d70f7b782540ff2aaf158d6742 2025-12-04T09:17:11.2995377Z * [new tag] trunk/9fb52efc797b47a1f425a03aa5e47b866d8b1098 -> trunk/9fb52efc797b47a1f425a03aa5e47b866d8b1098 2025-12-04T09:17:11.2996860Z * [new tag] trunk/9ff4a2ebc5762d46c73e46b1b523d7ff349fedfa -> trunk/9ff4a2ebc5762d46c73e46b1b523d7ff349fedfa 2025-12-04T09:17:11.2998640Z * [new tag] trunk/a0f3937b94422354538ebbd47202d5b0e8a3fd0d -> trunk/a0f3937b94422354538ebbd47202d5b0e8a3fd0d 2025-12-04T09:17:11.3000087Z * [new tag] trunk/a15066c28b3145e6edbfc88359d0411d14cfc70c -> trunk/a15066c28b3145e6edbfc88359d0411d14cfc70c 2025-12-04T09:17:11.3001614Z * [new tag] trunk/a20f775e82564d2a9979221ed7f3b8d7cf54ce90 -> trunk/a20f775e82564d2a9979221ed7f3b8d7cf54ce90 2025-12-04T09:17:11.3003070Z * [new tag] trunk/a2973fb00ec002dd4b6bbf07385f066efb259b8c -> trunk/a2973fb00ec002dd4b6bbf07385f066efb259b8c 2025-12-04T09:17:11.3004160Z * [new tag] trunk/a7dc6dab9ad911259d4801c502907e531594db45 -> trunk/a7dc6dab9ad911259d4801c502907e531594db45 2025-12-04T09:17:11.3005993Z * [new tag] trunk/a951a9cee65c01660bbc6e6fded90ecb10fa6109 -> trunk/a951a9cee65c01660bbc6e6fded90ecb10fa6109 2025-12-04T09:17:11.3007552Z * [new tag] trunk/abfa1a6d65c7c159e35c72c25979b9da4971689e -> trunk/abfa1a6d65c7c159e35c72c25979b9da4971689e 2025-12-04T09:17:11.3009099Z * [new tag] trunk/ae3a2395bf66151078e2d201716f7d63ce1c6f3e -> trunk/ae3a2395bf66151078e2d201716f7d63ce1c6f3e 2025-12-04T09:17:11.3010192Z * [new tag] trunk/afdff7f0325080dedac44d080cb5a3b0e65e6c5e -> trunk/afdff7f0325080dedac44d080cb5a3b0e65e6c5e 2025-12-04T09:17:11.3011796Z * [new tag] trunk/b1aed4e7a72c03a38f44543aaea0dae2e9b76d48 -> trunk/b1aed4e7a72c03a38f44543aaea0dae2e9b76d48 2025-12-04T09:17:11.3013395Z * [new tag] trunk/b1decff555cd50e2123c8c6e25cc0d447c411f62 -> trunk/b1decff555cd50e2123c8c6e25cc0d447c411f62 2025-12-04T09:17:11.3015002Z * [new tag] trunk/b2b6b034c9fd08672c40e63ef243556ad4c49bd2 -> trunk/b2b6b034c9fd08672c40e63ef243556ad4c49bd2 2025-12-04T09:17:11.3016487Z * [new tag] trunk/b39813b4a04931682b0491adba2138d01d716d99 -> trunk/b39813b4a04931682b0491adba2138d01d716d99 2025-12-04T09:17:11.3018029Z * [new tag] trunk/b3a7edb2311367974cc7cd764cfb11a5d6758b24 -> trunk/b3a7edb2311367974cc7cd764cfb11a5d6758b24 2025-12-04T09:17:11.3019786Z * [new tag] trunk/b4cc1329c86acaef6d42c1fac7169b8d870ab0d7 -> trunk/b4cc1329c86acaef6d42c1fac7169b8d870ab0d7 2025-12-04T09:17:11.3020925Z * [new tag] trunk/b555c39217f765759954a4f9f9bd1e9b87bed11a -> trunk/b555c39217f765759954a4f9f9bd1e9b87bed11a 2025-12-04T09:17:11.3022809Z * [new tag] trunk/b6b6c80379388b7f9932c3e6a0f9907bf430e417 -> trunk/b6b6c80379388b7f9932c3e6a0f9907bf430e417 2025-12-04T09:17:11.3024677Z * [new tag] trunk/b6b6d912df0b6f4082f8e50b18bd1de1dd7325f4 -> trunk/b6b6d912df0b6f4082f8e50b18bd1de1dd7325f4 2025-12-04T09:17:11.3026311Z * [new tag] trunk/b7d60685f8cbc939b68a20871e90db67e729329b -> trunk/b7d60685f8cbc939b68a20871e90db67e729329b 2025-12-04T09:17:11.3027751Z * [new tag] trunk/b7f6b9a4fc6259f7af068f31868b3119bb1bac3e -> trunk/b7f6b9a4fc6259f7af068f31868b3119bb1bac3e 2025-12-04T09:17:11.3029485Z * [new tag] trunk/b8c4ba3593761e7b2a3ebd86f040fb07b47c02cf -> trunk/b8c4ba3593761e7b2a3ebd86f040fb07b47c02cf 2025-12-04T09:17:11.3030668Z * [new tag] trunk/b9c8f3a4884befb965ff42620ce44a71b04887f5 -> trunk/b9c8f3a4884befb965ff42620ce44a71b04887f5 2025-12-04T09:17:11.3032570Z * [new tag] trunk/ba1412546f3082c0958c077acc2025e4dbc33f1f -> trunk/ba1412546f3082c0958c077acc2025e4dbc33f1f 2025-12-04T09:17:11.3034142Z * [new tag] trunk/bac403c0b38c63bdbcc0c31f1c2b0bc0260f610f -> trunk/bac403c0b38c63bdbcc0c31f1c2b0bc0260f610f 2025-12-04T09:17:11.3035607Z * [new tag] trunk/bb3034198b459401fabeab254e1b99f0115046e2 -> trunk/bb3034198b459401fabeab254e1b99f0115046e2 2025-12-04T09:17:11.3037175Z * [new tag] trunk/bc39b2b3bc7a6e19a42e62bd576974035086fe55 -> trunk/bc39b2b3bc7a6e19a42e62bd576974035086fe55 2025-12-04T09:17:11.3039134Z * [new tag] trunk/bc43d5b297f207a11d83d77ddf0152bdaabe15a8 -> trunk/bc43d5b297f207a11d83d77ddf0152bdaabe15a8 2025-12-04T09:17:11.3040188Z * [new tag] trunk/bc6a4863c7246a6493d16d4ea6eee71ec07c6a09 -> trunk/bc6a4863c7246a6493d16d4ea6eee71ec07c6a09 2025-12-04T09:17:11.3041958Z * [new tag] trunk/bea4912944defdbcb8b061800caab6cbbbd01df5 -> trunk/bea4912944defdbcb8b061800caab6cbbbd01df5 2025-12-04T09:17:11.3043768Z * [new tag] trunk/c04e2c656f48d82d1521b867bbbf03967b9b7564 -> trunk/c04e2c656f48d82d1521b867bbbf03967b9b7564 2025-12-04T09:17:11.3045464Z * [new tag] trunk/c0660bcee27e7d7731634e274576a7081882bede -> trunk/c0660bcee27e7d7731634e274576a7081882bede 2025-12-04T09:17:11.3046941Z * [new tag] trunk/c178ed43d3d99cbefe84fbfb21d6f282b20d62ac -> trunk/c178ed43d3d99cbefe84fbfb21d6f282b20d62ac 2025-12-04T09:17:11.3048416Z * [new tag] trunk/c55b1e8f61d041ee436d697449eb028931d574fb -> trunk/c55b1e8f61d041ee436d697449eb028931d574fb 2025-12-04T09:17:11.3049566Z * [new tag] trunk/c6ae7579fe12fe75f1a8f7043a494c90567273f1 -> trunk/c6ae7579fe12fe75f1a8f7043a494c90567273f1 2025-12-04T09:17:11.3051592Z * [new tag] trunk/c8210e7d94bad5ae21ac389fa4ba8a463c76c4d0 -> trunk/c8210e7d94bad5ae21ac389fa4ba8a463c76c4d0 2025-12-04T09:17:11.3053192Z * [new tag] trunk/cc0853af42122f8185321f542616f4474e717f09 -> trunk/cc0853af42122f8185321f542616f4474e717f09 2025-12-04T09:17:11.3054310Z * [new tag] trunk/cddec6562eabfa390d014fa3741a5659cf9c94c9 -> trunk/cddec6562eabfa390d014fa3741a5659cf9c94c9 2025-12-04T09:17:11.3056145Z * [new tag] trunk/ce5e7e3bf1f4b69a4f4f93d288ba75b906df492a -> trunk/ce5e7e3bf1f4b69a4f4f93d288ba75b906df492a 2025-12-04T09:17:11.3057618Z * [new tag] trunk/d038b0130ec7c20ebcac219301292fd8e98a1ace -> trunk/d038b0130ec7c20ebcac219301292fd8e98a1ace 2025-12-04T09:17:11.3059312Z * [new tag] trunk/d16447dacaf2420ea175f0c275c75da951f57d39 -> trunk/d16447dacaf2420ea175f0c275c75da951f57d39 2025-12-04T09:17:11.3060810Z * [new tag] trunk/d19f1e8cab6810bb2e99141f9976665954c67a50 -> trunk/d19f1e8cab6810bb2e99141f9976665954c67a50 2025-12-04T09:17:11.3062475Z * [new tag] trunk/d1c9f03b2a5af4104721712f8cdffe9b4f340c01 -> trunk/d1c9f03b2a5af4104721712f8cdffe9b4f340c01 2025-12-04T09:17:11.3064171Z * [new tag] trunk/d40f4950f2b7f7aa380a22fe0f6166e71680fbcf -> trunk/d40f4950f2b7f7aa380a22fe0f6166e71680fbcf 2025-12-04T09:17:11.3065668Z * [new tag] trunk/d5038950bacfe36bbf24a47a455fe76901deb8e8 -> trunk/d5038950bacfe36bbf24a47a455fe76901deb8e8 2025-12-04T09:17:11.3067555Z * [new tag] trunk/d54ff42903c2ae0533931ff11d23b35f875bdb3d -> trunk/d54ff42903c2ae0533931ff11d23b35f875bdb3d 2025-12-04T09:17:11.3069130Z * [new tag] trunk/d76697633a2d2b9cced1ae21161849b33bfe7e47 -> trunk/d76697633a2d2b9cced1ae21161849b33bfe7e47 2025-12-04T09:17:11.3070643Z * [new tag] trunk/d78f52b199c547106d4cd9d2856dd0805c118bf1 -> trunk/d78f52b199c547106d4cd9d2856dd0805c118bf1 2025-12-04T09:17:11.3072175Z * [new tag] trunk/d8fd5c6eed28e5004150691d048a3f6785e19a8e -> trunk/d8fd5c6eed28e5004150691d048a3f6785e19a8e 2025-12-04T09:17:11.3073731Z * [new tag] trunk/d900f5e86745dec76713f4b0ef07005ef36b2f5a -> trunk/d900f5e86745dec76713f4b0ef07005ef36b2f5a 2025-12-04T09:17:11.3075252Z * [new tag] trunk/d973dc6b87d763859fe1c5bd1287e3b6b1c49d1b -> trunk/d973dc6b87d763859fe1c5bd1287e3b6b1c49d1b 2025-12-04T09:17:11.3076885Z * [new tag] trunk/d998c03304cb6ede76e1ed535b4ddeb6c2bf40ec -> trunk/d998c03304cb6ede76e1ed535b4ddeb6c2bf40ec 2025-12-04T09:17:11.3078442Z * [new tag] trunk/d9cb8a70833101dbbe16b99520cfbdd70d0a87bf -> trunk/d9cb8a70833101dbbe16b99520cfbdd70d0a87bf 2025-12-04T09:17:11.3079699Z * [new tag] trunk/d9d5e91b43f70eb8637af55db6856d49be391ffd -> trunk/d9d5e91b43f70eb8637af55db6856d49be391ffd 2025-12-04T09:17:11.3081380Z * [new tag] trunk/dd18a75336a4fbd7497955cc5665904724fce889 -> trunk/dd18a75336a4fbd7497955cc5665904724fce889 2025-12-04T09:17:11.3082942Z * [new tag] trunk/ded9bcd61a059bf723e6e84689552962b480ea77 -> trunk/ded9bcd61a059bf723e6e84689552962b480ea77 2025-12-04T09:17:11.3084755Z * [new tag] trunk/dfbd3714d15c37a7b83b322a6b60f997fc00f50c -> trunk/dfbd3714d15c37a7b83b322a6b60f997fc00f50c 2025-12-04T09:17:11.3086364Z * [new tag] trunk/e115f9f4e4b039f8e9a642aaa2bd8254a920541b -> trunk/e115f9f4e4b039f8e9a642aaa2bd8254a920541b 2025-12-04T09:17:11.3087396Z * [new tag] trunk/e3f24fd73ad74c6e7176687986436956c7c18235 -> trunk/e3f24fd73ad74c6e7176687986436956c7c18235 2025-12-04T09:17:11.3089309Z * [new tag] trunk/e7d24d3ff93d1503ba63860b7057438ad93f918e -> trunk/e7d24d3ff93d1503ba63860b7057438ad93f918e 2025-12-04T09:17:11.3090901Z * [new tag] trunk/ea7035f462a0d2830865ee86c832bd101e1427fc -> trunk/ea7035f462a0d2830865ee86c832bd101e1427fc 2025-12-04T09:17:11.3092429Z * [new tag] trunk/eabb7ad2128580ef674446027b95bcf4e21e8df3 -> trunk/eabb7ad2128580ef674446027b95bcf4e21e8df3 2025-12-04T09:17:11.3094012Z * [new tag] trunk/eb5c63652a33da42e7018c23df5f20a3eb4c6ccf -> trunk/eb5c63652a33da42e7018c23df5f20a3eb4c6ccf 2025-12-04T09:17:11.3095470Z * [new tag] trunk/ec2c71f5c85021b8938cdafadce24c15a36fd93e -> trunk/ec2c71f5c85021b8938cdafadce24c15a36fd93e 2025-12-04T09:17:11.3097063Z * [new tag] trunk/ecbcc3f6bf327856b435b259ac63cc2f328c4b4e -> trunk/ecbcc3f6bf327856b435b259ac63cc2f328c4b4e 2025-12-04T09:17:11.3098992Z * [new tag] trunk/ee87bbe876c42575e961b32a0827d76bc9782ca2 -> trunk/ee87bbe876c42575e961b32a0827d76bc9782ca2 2025-12-04T09:17:11.3100512Z * [new tag] trunk/ef019d1d431c4c5a95b594cb90d40a50cd00f5e4 -> trunk/ef019d1d431c4c5a95b594cb90d40a50cd00f5e4 2025-12-04T09:17:11.3102076Z * [new tag] trunk/ef8ecc13830a86c4b231f1aad9aba7851db61b53 -> trunk/ef8ecc13830a86c4b231f1aad9aba7851db61b53 2025-12-04T09:17:11.3103826Z * [new tag] trunk/f1076f5510920044912247b1abb8760cb820f598 -> trunk/f1076f5510920044912247b1abb8760cb820f598 2025-12-04T09:17:11.3105233Z * [new tag] trunk/f2d6a75a00a1d648ca9a0abc6a33e14c3dea6c40 -> trunk/f2d6a75a00a1d648ca9a0abc6a33e14c3dea6c40 2025-12-04T09:17:11.3106729Z * [new tag] trunk/f47dd0ddef1359e5b43e4b962412f67b30ecde56 -> trunk/f47dd0ddef1359e5b43e4b962412f67b30ecde56 2025-12-04T09:17:11.3108340Z * [new tag] trunk/f49d32dfa4730dcfb1b60eeeb369b5889da983c8 -> trunk/f49d32dfa4730dcfb1b60eeeb369b5889da983c8 2025-12-04T09:17:11.3109895Z * [new tag] trunk/f4dedf78fc30fd4b93975787ca6074ee89db9467 -> trunk/f4dedf78fc30fd4b93975787ca6074ee89db9467 2025-12-04T09:17:11.3111414Z * [new tag] trunk/f7c0d03819ebed05c4038f095d66d1b8c54aca17 -> trunk/f7c0d03819ebed05c4038f095d66d1b8c54aca17 2025-12-04T09:17:11.3112904Z * [new tag] trunk/f7e1bd80a063e17453c361837ba6ea2570920a73 -> trunk/f7e1bd80a063e17453c361837ba6ea2570920a73 2025-12-04T09:17:11.3113994Z * [new tag] trunk/f9bd6c53624c7c0ea3772de78498326e84c2f0e7 -> trunk/f9bd6c53624c7c0ea3772de78498326e84c2f0e7 2025-12-04T09:17:11.3115936Z * [new tag] trunk/fb5be221a46b51bfc9509013b0d85bc5a9d4f15b -> trunk/fb5be221a46b51bfc9509013b0d85bc5a9d4f15b 2025-12-04T09:17:11.3117440Z * [new tag] trunk/fdf863d5e1de3b2688c9511e96876e34581dbfd7 -> trunk/fdf863d5e1de3b2688c9511e96876e34581dbfd7 2025-12-04T09:17:11.3119532Z * [new tag] trunk/fe0e65adfc0e7ca6e5f57e6ea8b16bd5cc967307 -> trunk/fe0e65adfc0e7ca6e5f57e6ea8b16bd5cc967307 2025-12-04T09:17:11.3121079Z * [new tag] trunk/fec710bf89173f5355468a7ce1afe9157c3d9009 -> trunk/fec710bf89173f5355468a7ce1afe9157c3d9009 2025-12-04T09:17:11.3122758Z * [new tag] trunk/ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 -> trunk/ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:17:11.3123577Z * [new tag] v0.1.1 -> v0.1.1 2025-12-04T09:17:11.3125441Z * [new tag] v0.1.10 -> v0.1.10 2025-12-04T09:17:11.3126688Z * [new tag] v0.1.11 -> v0.1.11 2025-12-04T09:17:11.3128595Z * [new tag] v0.1.12 -> v0.1.12 2025-12-04T09:17:11.3129505Z * [new tag] v0.1.2 -> v0.1.2 2025-12-04T09:17:11.3131077Z * [new tag] v0.1.3 -> v0.1.3 2025-12-04T09:17:11.3132469Z * [new tag] v0.1.4 -> v0.1.4 2025-12-04T09:17:11.3133861Z * [new tag] v0.1.5 -> v0.1.5 2025-12-04T09:17:11.3135344Z * [new tag] v0.1.6 -> v0.1.6 2025-12-04T09:17:11.3136684Z * [new tag] v0.1.7 -> v0.1.7 2025-12-04T09:17:11.3137934Z * [new tag] v0.1.8 -> v0.1.8 2025-12-04T09:17:11.3139379Z * [new tag] v0.1.9 -> v0.1.9 2025-12-04T09:17:11.3140830Z * [new tag] v0.2.0 -> v0.2.0 2025-12-04T09:17:11.3142276Z * [new tag] v0.3.0 -> v0.3.0 2025-12-04T09:17:11.3143877Z * [new tag] v0.3.1 -> v0.3.1 2025-12-04T09:17:11.3145311Z * [new tag] v0.4.0 -> v0.4.0 2025-12-04T09:17:11.3146741Z * [new tag] v0.4.1 -> v0.4.1 2025-12-04T09:17:11.3148106Z * [new tag] v1.0.0 -> v1.0.0 2025-12-04T09:17:11.3149552Z * [new tag] v1.0.0a0 -> v1.0.0a0 2025-12-04T09:17:11.3150928Z * [new tag] v1.0.1 -> v1.0.1 2025-12-04T09:17:11.3152553Z * [new tag] v1.0rc0 -> v1.0rc0 2025-12-04T09:17:11.3153343Z * [new tag] v1.0rc1 -> v1.0rc1 2025-12-04T09:17:11.3155039Z * [new tag] v1.1.0 -> v1.1.0 2025-12-04T09:17:11.3156576Z * [new tag] v1.1.0a0 -> v1.1.0a0 2025-12-04T09:17:11.3158189Z * [new tag] v1.10.0 -> v1.10.0 2025-12-04T09:17:11.3159736Z * [new tag] v1.10.0-rc1 -> v1.10.0-rc1 2025-12-04T09:17:11.3161184Z * [new tag] v1.10.0-rc2 -> v1.10.0-rc2 2025-12-04T09:17:11.3162402Z * [new tag] v1.10.0-rc3 -> v1.10.0-rc3 2025-12-04T09:17:11.3163875Z * [new tag] v1.10.1 -> v1.10.1 2025-12-04T09:17:11.3165087Z * [new tag] v1.10.1-rc1 -> v1.10.1-rc1 2025-12-04T09:17:11.3166302Z * [new tag] v1.10.2 -> v1.10.2 2025-12-04T09:17:11.3167521Z * [new tag] v1.10.2-rc1 -> v1.10.2-rc1 2025-12-04T09:17:11.3168996Z * [new tag] v1.11.0 -> v1.11.0 2025-12-04T09:17:11.3170517Z * [new tag] v1.11.0-rc1 -> v1.11.0-rc1 2025-12-04T09:17:11.3172128Z * [new tag] v1.11.0-rc2 -> v1.11.0-rc2 2025-12-04T09:17:11.3173620Z * [new tag] v1.11.0-rc3 -> v1.11.0-rc3 2025-12-04T09:17:11.3175126Z * [new tag] v1.11.0-rc4 -> v1.11.0-rc4 2025-12-04T09:17:11.3176590Z * [new tag] v1.11.0-rc5 -> v1.11.0-rc5 2025-12-04T09:17:11.3177527Z * [new tag] v1.11.0-rc6 -> v1.11.0-rc6 2025-12-04T09:17:11.3179475Z * [new tag] v1.11.0-rc7 -> v1.11.0-rc7 2025-12-04T09:17:11.3181181Z * [new tag] v1.12.0 -> v1.12.0 2025-12-04T09:17:11.3182373Z * [new tag] v1.12.0-rc1 -> v1.12.0-rc1 2025-12-04T09:17:11.3184036Z * [new tag] v1.12.0-rc2 -> v1.12.0-rc2 2025-12-04T09:17:11.3185514Z * [new tag] v1.12.0-rc3 -> v1.12.0-rc3 2025-12-04T09:17:11.3187083Z * [new tag] v1.12.0-rc4 -> v1.12.0-rc4 2025-12-04T09:17:11.3188500Z * [new tag] v1.12.0-rc5 -> v1.12.0-rc5 2025-12-04T09:17:11.3190065Z * [new tag] v1.12.0-rc6 -> v1.12.0-rc6 2025-12-04T09:17:11.3191298Z * [new tag] v1.12.0-rc7 -> v1.12.0-rc7 2025-12-04T09:17:11.3192213Z * [new tag] v1.12.0-rc8 -> v1.12.0-rc8 2025-12-04T09:17:11.3193758Z * [new tag] v1.12.1 -> v1.12.1 2025-12-04T09:17:11.3195279Z * [new tag] v1.12.1-rc1 -> v1.12.1-rc1 2025-12-04T09:17:11.3196724Z * [new tag] v1.12.1-rc2 -> v1.12.1-rc2 2025-12-04T09:17:11.3198274Z * [new tag] v1.12.1-rc3 -> v1.12.1-rc3 2025-12-04T09:17:11.3199698Z * [new tag] v1.12.1-rc4 -> v1.12.1-rc4 2025-12-04T09:17:11.3200913Z * [new tag] v1.12.1-rc5 -> v1.12.1-rc5 2025-12-04T09:17:11.3202468Z * [new tag] v1.13.0 -> v1.13.0 2025-12-04T09:17:11.3203858Z * [new tag] v1.13.0-rc1 -> v1.13.0-rc1 2025-12-04T09:17:11.3205287Z * [new tag] v1.13.0-rc2 -> v1.13.0-rc2 2025-12-04T09:17:11.3206740Z * [new tag] v1.13.0-rc3 -> v1.13.0-rc3 2025-12-04T09:17:11.3208304Z * [new tag] v1.13.0-rc4 -> v1.13.0-rc4 2025-12-04T09:17:11.3209556Z * [new tag] v1.13.0-rc5 -> v1.13.0-rc5 2025-12-04T09:17:11.3210728Z * [new tag] v1.13.0-rc6 -> v1.13.0-rc6 2025-12-04T09:17:11.3212214Z * [new tag] v1.13.1 -> v1.13.1 2025-12-04T09:17:11.3213487Z * [new tag] v1.13.1-rc1 -> v1.13.1-rc1 2025-12-04T09:17:11.3214877Z * [new tag] v1.2.0 -> v1.2.0 2025-12-04T09:17:11.3216326Z * [new tag] v1.2.0a0 -> v1.2.0a0 2025-12-04T09:17:11.3217798Z * [new tag] v1.3.0 -> v1.3.0 2025-12-04T09:17:11.3219196Z * [new tag] v1.3.0a0 -> v1.3.0a0 2025-12-04T09:17:11.3220447Z * [new tag] v1.3.1 -> v1.3.1 2025-12-04T09:17:11.3221827Z * [new tag] v1.4.0 -> v1.4.0 2025-12-04T09:17:11.3223397Z * [new tag] v1.4.0a0 -> v1.4.0a0 2025-12-04T09:17:11.3224891Z * [new tag] v1.4.1 -> v1.4.1 2025-12-04T09:17:11.3228927Z * [new tag] v1.5.0 -> v1.5.0 2025-12-04T09:17:11.3230447Z * [new tag] v1.5.0-rc1 -> v1.5.0-rc1 2025-12-04T09:17:11.3231928Z * [new tag] v1.5.0-rc2 -> v1.5.0-rc2 2025-12-04T09:17:11.3233495Z * [new tag] v1.5.0-rc3 -> v1.5.0-rc3 2025-12-04T09:17:11.3234834Z * [new tag] v1.5.0-rc4 -> v1.5.0-rc4 2025-12-04T09:17:11.3236090Z * [new tag] v1.5.0-rc5 -> v1.5.0-rc5 2025-12-04T09:17:11.3237579Z * [new tag] v1.5.1 -> v1.5.1 2025-12-04T09:17:11.3238825Z * [new tag] v1.5.1-rc1 -> v1.5.1-rc1 2025-12-04T09:17:11.3240021Z * [new tag] v1.6.0 -> v1.6.0 2025-12-04T09:17:11.3241530Z * [new tag] v1.6.0-rc1 -> v1.6.0-rc1 2025-12-04T09:17:11.3243397Z * [new tag] v1.6.0-rc2 -> v1.6.0-rc2 2025-12-04T09:17:11.3244270Z * [new tag] v1.6.0-rc3 -> v1.6.0-rc3 2025-12-04T09:17:11.3245878Z * [new tag] v1.6.0-rc4 -> v1.6.0-rc4 2025-12-04T09:17:11.3247356Z * [new tag] v1.6.0-rc5 -> v1.6.0-rc5 2025-12-04T09:17:11.3248757Z * [new tag] v1.6.0-rc6 -> v1.6.0-rc6 2025-12-04T09:17:11.3250084Z * [new tag] v1.6.0-rc7 -> v1.6.0-rc7 2025-12-04T09:17:11.3251663Z * [new tag] v1.7.0 -> v1.7.0 2025-12-04T09:17:11.3253152Z * [new tag] v1.7.0-rc1 -> v1.7.0-rc1 2025-12-04T09:17:11.3254639Z * [new tag] v1.7.0-rc2 -> v1.7.0-rc2 2025-12-04T09:17:11.3256167Z * [new tag] v1.7.0-rc3 -> v1.7.0-rc3 2025-12-04T09:17:11.3257339Z * [new tag] v1.7.0-rc4 -> v1.7.0-rc4 2025-12-04T09:17:11.3258876Z * [new tag] v1.7.1 -> v1.7.1 2025-12-04T09:17:11.3260523Z * [new tag] v1.7.1-rc1 -> v1.7.1-rc1 2025-12-04T09:17:11.3262027Z * [new tag] v1.7.1-rc2 -> v1.7.1-rc2 2025-12-04T09:17:11.3263423Z * [new tag] v1.7.1-rc3 -> v1.7.1-rc3 2025-12-04T09:17:11.3264874Z * [new tag] v1.8.0 -> v1.8.0 2025-12-04T09:17:11.3266074Z * [new tag] v1.8.0-rc1 -> v1.8.0-rc1 2025-12-04T09:17:11.3267622Z * [new tag] v1.8.0-rc2 -> v1.8.0-rc2 2025-12-04T09:17:11.3269061Z * [new tag] v1.8.0-rc3 -> v1.8.0-rc3 2025-12-04T09:17:11.3270410Z * [new tag] v1.8.0-rc4 -> v1.8.0-rc4 2025-12-04T09:17:11.3271326Z * [new tag] v1.8.0-rc5 -> v1.8.0-rc5 2025-12-04T09:17:11.3273295Z * [new tag] v1.8.1 -> v1.8.1 2025-12-04T09:17:11.3274808Z * [new tag] v1.8.1-rc1 -> v1.8.1-rc1 2025-12-04T09:17:11.3276048Z * [new tag] v1.8.1-rc2 -> v1.8.1-rc2 2025-12-04T09:17:11.3277205Z * [new tag] v1.8.1-rc3 -> v1.8.1-rc3 2025-12-04T09:17:11.3279232Z * [new tag] v1.8.2 -> v1.8.2 2025-12-04T09:17:11.3280484Z * [new tag] v1.8.2-rc1 -> v1.8.2-rc1 2025-12-04T09:17:11.3281952Z * [new tag] v1.9.0 -> v1.9.0 2025-12-04T09:17:11.3283562Z * [new tag] v1.9.0-rc1 -> v1.9.0-rc1 2025-12-04T09:17:11.3284953Z * [new tag] v1.9.0-rc2 -> v1.9.0-rc2 2025-12-04T09:17:11.3286434Z * [new tag] v1.9.0-rc3 -> v1.9.0-rc3 2025-12-04T09:17:11.3287412Z * [new tag] v1.9.0-rc4 -> v1.9.0-rc4 2025-12-04T09:17:11.3289196Z * [new tag] v1.9.1 -> v1.9.1 2025-12-04T09:17:11.3290831Z * [new tag] v1.9.1-rc1 -> v1.9.1-rc1 2025-12-04T09:17:11.3292050Z * [new tag] v1.9.1-rc2 -> v1.9.1-rc2 2025-12-04T09:17:11.3293549Z * [new tag] v2.0.0 -> v2.0.0 2025-12-04T09:17:11.3294923Z * [new tag] v2.0.0-rc1 -> v2.0.0-rc1 2025-12-04T09:17:11.3296531Z * [new tag] v2.0.0-rc2 -> v2.0.0-rc2 2025-12-04T09:17:11.3297996Z * [new tag] v2.0.0-rc3 -> v2.0.0-rc3 2025-12-04T09:17:11.3299435Z * [new tag] v2.0.0-rc4 -> v2.0.0-rc4 2025-12-04T09:17:11.3300870Z * [new tag] v2.0.0-rc5 -> v2.0.0-rc5 2025-12-04T09:17:11.3302267Z * [new tag] v2.0.0-rc6 -> v2.0.0-rc6 2025-12-04T09:17:11.3303772Z * [new tag] v2.0.1 -> v2.0.1 2025-12-04T09:17:11.3305274Z * [new tag] v2.0.1-rc1 -> v2.0.1-rc1 2025-12-04T09:17:11.3306246Z * [new tag] v2.0.1-rc2 -> v2.0.1-rc2 2025-12-04T09:17:11.3307907Z * [new tag] v2.0.1-rc3 -> v2.0.1-rc3 2025-12-04T09:17:11.3309179Z * [new tag] v2.0.1-rc4 -> v2.0.1-rc4 2025-12-04T09:17:11.3311142Z * [new tag] v2.1.0 -> v2.1.0 2025-12-04T09:17:11.3312541Z * [new tag] v2.1.0-rc1 -> v2.1.0-rc1 2025-12-04T09:17:11.3314041Z * [new tag] v2.1.0-rc2 -> v2.1.0-rc2 2025-12-04T09:17:11.3315607Z * [new tag] v2.1.0-rc3 -> v2.1.0-rc3 2025-12-04T09:17:11.3317128Z * [new tag] v2.1.0-rc4 -> v2.1.0-rc4 2025-12-04T09:17:11.3318617Z * [new tag] v2.1.0-rc5 -> v2.1.0-rc5 2025-12-04T09:17:11.3319746Z * [new tag] v2.1.0-rc6 -> v2.1.0-rc6 2025-12-04T09:17:11.3321420Z * [new tag] v2.1.1 -> v2.1.1 2025-12-04T09:17:11.3322962Z * [new tag] v2.1.1-rc1 -> v2.1.1-rc1 2025-12-04T09:17:11.3324780Z * [new tag] v2.1.1-rc2 -> v2.1.1-rc2 2025-12-04T09:17:11.3326173Z * [new tag] v2.1.1-rc3 -> v2.1.1-rc3 2025-12-04T09:17:11.3327655Z * [new tag] v2.1.1-rc4 -> v2.1.1-rc4 2025-12-04T09:17:11.3329249Z * [new tag] v2.1.1-rc5 -> v2.1.1-rc5 2025-12-04T09:17:11.3330508Z * [new tag] v2.1.1-rc6 -> v2.1.1-rc6 2025-12-04T09:17:11.3331943Z * [new tag] v2.1.2 -> v2.1.2 2025-12-04T09:17:11.3333433Z * [new tag] v2.1.2-rc1 -> v2.1.2-rc1 2025-12-04T09:17:11.3334972Z * [new tag] v2.1.2-rc2 -> v2.1.2-rc2 2025-12-04T09:17:11.3336240Z * [new tag] v2.1.2-rc3 -> v2.1.2-rc3 2025-12-04T09:17:11.3337724Z * [new tag] v2.2.0 -> v2.2.0 2025-12-04T09:17:11.3339157Z * [new tag] v2.2.0-rc1 -> v2.2.0-rc1 2025-12-04T09:17:11.3340619Z * [new tag] v2.2.0-rc2 -> v2.2.0-rc2 2025-12-04T09:17:11.3342038Z * [new tag] v2.2.0-rc3 -> v2.2.0-rc3 2025-12-04T09:17:11.3343722Z * [new tag] v2.2.0-rc4 -> v2.2.0-rc4 2025-12-04T09:17:11.3345045Z * [new tag] v2.2.0-rc5 -> v2.2.0-rc5 2025-12-04T09:17:11.3346543Z * [new tag] v2.2.0-rc6 -> v2.2.0-rc6 2025-12-04T09:17:11.3347800Z * [new tag] v2.2.0-rc7 -> v2.2.0-rc7 2025-12-04T09:17:11.3348954Z * [new tag] v2.2.0-rc8 -> v2.2.0-rc8 2025-12-04T09:17:11.3350405Z * [new tag] v2.2.1 -> v2.2.1 2025-12-04T09:17:11.3351951Z * [new tag] v2.2.1-rc1 -> v2.2.1-rc1 2025-12-04T09:17:11.3353627Z * [new tag] v2.2.1-rc2 -> v2.2.1-rc2 2025-12-04T09:17:11.3354849Z * [new tag] v2.2.1-rc3 -> v2.2.1-rc3 2025-12-04T09:17:11.3356068Z * [new tag] v2.2.2 -> v2.2.2 2025-12-04T09:17:11.3357673Z * [new tag] v2.2.2-rc1 -> v2.2.2-rc1 2025-12-04T09:17:11.3358956Z * [new tag] v2.2.2-rc2 -> v2.2.2-rc2 2025-12-04T09:17:11.3360191Z * [new tag] v2.2.2-rc3 -> v2.2.2-rc3 2025-12-04T09:17:11.3361788Z * [new tag] v2.3.0 -> v2.3.0 2025-12-04T09:17:11.3363098Z * [new tag] v2.3.0-rc1 -> v2.3.0-rc1 2025-12-04T09:17:11.3364639Z * [new tag] v2.3.0-rc10 -> v2.3.0-rc10 2025-12-04T09:17:11.3366983Z * [new tag] v2.3.0-rc11 -> v2.3.0-rc11 2025-12-04T09:17:11.3368008Z * [new tag] v2.3.0-rc12 -> v2.3.0-rc12 2025-12-04T09:17:11.3369706Z * [new tag] v2.3.0-rc2 -> v2.3.0-rc2 2025-12-04T09:17:11.3371242Z * [new tag] v2.3.0-rc3 -> v2.3.0-rc3 2025-12-04T09:17:11.3372714Z * [new tag] v2.3.0-rc4 -> v2.3.0-rc4 2025-12-04T09:17:11.3374127Z * [new tag] v2.3.0-rc5 -> v2.3.0-rc5 2025-12-04T09:17:11.3375408Z * [new tag] v2.3.0-rc6 -> v2.3.0-rc6 2025-12-04T09:17:11.3376896Z * [new tag] v2.3.0-rc7 -> v2.3.0-rc7 2025-12-04T09:17:11.3378356Z * [new tag] v2.3.0-rc8 -> v2.3.0-rc8 2025-12-04T09:17:11.3379536Z * [new tag] v2.3.0-rc9 -> v2.3.0-rc9 2025-12-04T09:17:11.3380830Z * [new tag] v2.3.1 -> v2.3.1 2025-12-04T09:17:11.3382338Z * [new tag] v2.3.1-rc1 -> v2.3.1-rc1 2025-12-04T09:17:11.3384001Z * [new tag] v2.3.1-rc2 -> v2.3.1-rc2 2025-12-04T09:17:11.3385512Z * [new tag] v2.3.1-rc3 -> v2.3.1-rc3 2025-12-04T09:17:11.3386968Z * [new tag] v2.4.0 -> v2.4.0 2025-12-04T09:17:11.3388444Z * [new tag] v2.4.0-rc1 -> v2.4.0-rc1 2025-12-04T09:17:11.3389940Z * [new tag] v2.4.0-rc2 -> v2.4.0-rc2 2025-12-04T09:17:11.3391401Z * [new tag] v2.4.0-rc3 -> v2.4.0-rc3 2025-12-04T09:17:11.3393035Z * [new tag] v2.4.0-rc4 -> v2.4.0-rc4 2025-12-04T09:17:11.3394381Z * [new tag] v2.4.0-rc5 -> v2.4.0-rc5 2025-12-04T09:17:11.3395911Z * [new tag] v2.4.0-rc6 -> v2.4.0-rc6 2025-12-04T09:17:11.3397377Z * [new tag] v2.4.0-rc7 -> v2.4.0-rc7 2025-12-04T09:17:11.3398799Z * [new tag] v2.4.0-rc8 -> v2.4.0-rc8 2025-12-04T09:17:11.3400317Z * [new tag] v2.4.0-rc9 -> v2.4.0-rc9 2025-12-04T09:17:11.3401660Z * [new tag] v2.4.1 -> v2.4.1 2025-12-04T09:17:11.3403133Z * [new tag] v2.4.1-rc1 -> v2.4.1-rc1 2025-12-04T09:17:11.3404598Z * [new tag] v2.4.1-rc2 -> v2.4.1-rc2 2025-12-04T09:17:11.3406101Z * [new tag] v2.4.1-rc3 -> v2.4.1-rc3 2025-12-04T09:17:11.3407599Z * [new tag] v2.5.0 -> v2.5.0 2025-12-04T09:17:11.3409011Z * [new tag] v2.5.0-rc1 -> v2.5.0-rc1 2025-12-04T09:17:11.3410239Z * [new tag] v2.5.0-rc10 -> v2.5.0-rc10 2025-12-04T09:17:11.3411599Z * [new tag] v2.5.0-rc2 -> v2.5.0-rc2 2025-12-04T09:17:11.3413076Z * [new tag] v2.5.0-rc3 -> v2.5.0-rc3 2025-12-04T09:17:11.3414533Z * [new tag] v2.5.0-rc4 -> v2.5.0-rc4 2025-12-04T09:17:11.3416053Z * [new tag] v2.5.0-rc5 -> v2.5.0-rc5 2025-12-04T09:17:11.3417614Z * [new tag] v2.5.0-rc6 -> v2.5.0-rc6 2025-12-04T09:17:11.3419096Z * [new tag] v2.5.0-rc7 -> v2.5.0-rc7 2025-12-04T09:17:11.3420583Z * [new tag] v2.5.0-rc8 -> v2.5.0-rc8 2025-12-04T09:17:11.3422154Z * [new tag] v2.5.0-rc9 -> v2.5.0-rc9 2025-12-04T09:17:11.3423402Z * [new tag] v2.5.1 -> v2.5.1 2025-12-04T09:17:11.3424953Z * [new tag] v2.5.1-rc1 -> v2.5.1-rc1 2025-12-04T09:17:11.3425825Z * [new tag] v2.6.0 -> v2.6.0 2025-12-04T09:17:11.3427584Z * [new tag] v2.6.0-rc1 -> v2.6.0-rc1 2025-12-04T09:17:11.3429086Z * [new tag] v2.6.0-rc2 -> v2.6.0-rc2 2025-12-04T09:17:11.3430612Z * [new tag] v2.6.0-rc3 -> v2.6.0-rc3 2025-12-04T09:17:11.3431947Z * [new tag] v2.6.0-rc4 -> v2.6.0-rc4 2025-12-04T09:17:11.3433684Z * [new tag] v2.6.0-rc5 -> v2.6.0-rc5 2025-12-04T09:17:11.3435285Z * [new tag] v2.6.0-rc6 -> v2.6.0-rc6 2025-12-04T09:17:11.3436952Z * [new tag] v2.6.0-rc7 -> v2.6.0-rc7 2025-12-04T09:17:11.3438448Z * [new tag] v2.6.0-rc8 -> v2.6.0-rc8 2025-12-04T09:17:11.3439999Z * [new tag] v2.6.0-rc9 -> v2.6.0-rc9 2025-12-04T09:17:11.3441635Z * [new tag] v2.7.0 -> v2.7.0 2025-12-04T09:17:11.3443113Z * [new tag] v2.7.0-rc1 -> v2.7.0-rc1 2025-12-04T09:17:11.3444370Z * [new tag] v2.7.0-rc10 -> v2.7.0-rc10 2025-12-04T09:17:11.3445932Z * [new tag] v2.7.0-rc2 -> v2.7.0-rc2 2025-12-04T09:17:11.3447408Z * [new tag] v2.7.0-rc3 -> v2.7.0-rc3 2025-12-04T09:17:11.3448942Z * [new tag] v2.7.0-rc4 -> v2.7.0-rc4 2025-12-04T09:17:11.3450377Z * [new tag] v2.7.0-rc5 -> v2.7.0-rc5 2025-12-04T09:17:11.3451813Z * [new tag] v2.7.0-rc6 -> v2.7.0-rc6 2025-12-04T09:17:11.3453294Z * [new tag] v2.7.0-rc7 -> v2.7.0-rc7 2025-12-04T09:17:11.3454884Z * [new tag] v2.7.0-rc8 -> v2.7.0-rc8 2025-12-04T09:17:11.3456380Z * [new tag] v2.7.0-rc9 -> v2.7.0-rc9 2025-12-04T09:17:11.3457631Z * [new tag] v2.7.1 -> v2.7.1 2025-12-04T09:17:11.3459187Z * [new tag] v2.7.1-rc1 -> v2.7.1-rc1 2025-12-04T09:17:11.3461154Z * [new tag] v2.7.1-rc2 -> v2.7.1-rc2 2025-12-04T09:17:11.3462805Z * [new tag] v2.7.1-rc3 -> v2.7.1-rc3 2025-12-04T09:17:11.3464389Z * [new tag] v2.7.1-rc4 -> v2.7.1-rc4 2025-12-04T09:17:11.3465920Z * [new tag] v2.7.1-rc5 -> v2.7.1-rc5 2025-12-04T09:17:11.3467189Z * [new tag] v2.8.0 -> v2.8.0 2025-12-04T09:17:11.3468872Z * [new tag] v2.8.0-rc1 -> v2.8.0-rc1 2025-12-04T09:17:11.3470320Z * [new tag] v2.8.0-rc2 -> v2.8.0-rc2 2025-12-04T09:17:11.3471862Z * [new tag] v2.8.0-rc3 -> v2.8.0-rc3 2025-12-04T09:17:11.3473488Z * [new tag] v2.8.0-rc4 -> v2.8.0-rc4 2025-12-04T09:17:11.3474967Z * [new tag] v2.8.0-rc5 -> v2.8.0-rc5 2025-12-04T09:17:11.3476603Z * [new tag] v2.8.0-rc6 -> v2.8.0-rc6 2025-12-04T09:17:11.3478101Z * [new tag] v2.8.0-rc7 -> v2.8.0-rc7 2025-12-04T09:17:11.3479629Z * [new tag] v2.8.0-rc8 -> v2.8.0-rc8 2025-12-04T09:17:11.3481144Z * [new tag] v2.9.0 -> v2.9.0 2025-12-04T09:17:11.3482639Z * [new tag] v2.9.0-rc1 -> v2.9.0-rc1 2025-12-04T09:17:11.3484273Z * [new tag] v2.9.0-rc10 -> v2.9.0-rc10 2025-12-04T09:17:11.3485714Z * [new tag] v2.9.0-rc11 -> v2.9.0-rc11 2025-12-04T09:17:11.3487520Z * [new tag] v2.9.0-rc2 -> v2.9.0-rc2 2025-12-04T09:17:11.3489043Z * [new tag] v2.9.0-rc3 -> v2.9.0-rc3 2025-12-04T09:17:11.3490629Z * [new tag] v2.9.0-rc4 -> v2.9.0-rc4 2025-12-04T09:17:11.3492162Z * [new tag] v2.9.0-rc5 -> v2.9.0-rc5 2025-12-04T09:17:11.3493863Z * [new tag] v2.9.0-rc6 -> v2.9.0-rc6 2025-12-04T09:17:11.3495360Z * [new tag] v2.9.0-rc7 -> v2.9.0-rc7 2025-12-04T09:17:11.3497039Z * [new tag] v2.9.0-rc8 -> v2.9.0-rc8 2025-12-04T09:17:11.3498365Z * [new tag] v2.9.0-rc9 -> v2.9.0-rc9 2025-12-04T09:17:11.3499589Z * [new tag] v2.9.1 -> v2.9.1 2025-12-04T09:17:11.3501163Z * [new tag] v2.9.1-rc1 -> v2.9.1-rc1 2025-12-04T09:17:11.3502725Z * [new tag] v2.9.1-rc2 -> v2.9.1-rc2 2025-12-04T09:17:11.3504910Z * [new tag] viable/strict/1759343184 -> viable/strict/1759343184 2025-12-04T09:17:11.3506319Z * [new tag] viable/strict/1759346540 -> viable/strict/1759346540 2025-12-04T09:17:11.3507633Z * [new tag] viable/strict/1759348181 -> viable/strict/1759348181 2025-12-04T09:17:11.3509093Z * [new tag] viable/strict/1759350324 -> viable/strict/1759350324 2025-12-04T09:17:11.3510571Z * [new tag] viable/strict/1759351793 -> viable/strict/1759351793 2025-12-04T09:17:11.3511954Z * [new tag] viable/strict/1759353844 -> viable/strict/1759353844 2025-12-04T09:17:11.3513369Z * [new tag] viable/strict/1759355374 -> viable/strict/1759355374 2025-12-04T09:17:11.3514741Z * [new tag] viable/strict/1759357472 -> viable/strict/1759357472 2025-12-04T09:17:11.3516196Z * [new tag] viable/strict/1759361002 -> viable/strict/1759361002 2025-12-04T09:17:11.3518009Z * [new tag] viable/strict/1759362585 -> viable/strict/1759362585 2025-12-04T09:17:11.3519575Z * [new tag] viable/strict/1759365359 -> viable/strict/1759365359 2025-12-04T09:17:11.3521161Z * [new tag] viable/strict/1759370089 -> viable/strict/1759370089 2025-12-04T09:17:11.3522685Z * [new tag] viable/strict/1759377554 -> viable/strict/1759377554 2025-12-04T09:17:11.3524143Z * [new tag] viable/strict/1759379133 -> viable/strict/1759379133 2025-12-04T09:17:11.3528622Z * [new tag] viable/strict/1759389871 -> viable/strict/1759389871 2025-12-04T09:17:11.3530094Z * [new tag] viable/strict/1759393562 -> viable/strict/1759393562 2025-12-04T09:17:11.3531615Z * [new tag] viable/strict/1759395076 -> viable/strict/1759395076 2025-12-04T09:17:11.3533125Z * [new tag] viable/strict/1759398579 -> viable/strict/1759398579 2025-12-04T09:17:11.3534715Z * [new tag] viable/strict/1759404142 -> viable/strict/1759404142 2025-12-04T09:17:11.3536204Z * [new tag] viable/strict/1759405773 -> viable/strict/1759405773 2025-12-04T09:17:11.3537666Z * [new tag] viable/strict/1759408041 -> viable/strict/1759408041 2025-12-04T09:17:11.3539091Z * [new tag] viable/strict/1759411593 -> viable/strict/1759411593 2025-12-04T09:17:11.3540578Z * [new tag] viable/strict/1759427395 -> viable/strict/1759427395 2025-12-04T09:17:11.3541985Z * [new tag] viable/strict/1759434582 -> viable/strict/1759434582 2025-12-04T09:17:11.3543614Z * [new tag] viable/strict/1759436720 -> viable/strict/1759436720 2025-12-04T09:17:11.3545370Z * [new tag] viable/strict/1759440219 -> viable/strict/1759440219 2025-12-04T09:17:11.3546654Z * [new tag] viable/strict/1759441948 -> viable/strict/1759441948 2025-12-04T09:17:11.3548172Z * [new tag] viable/strict/1759443860 -> viable/strict/1759443860 2025-12-04T09:17:11.3549607Z * [new tag] viable/strict/1759445377 -> viable/strict/1759445377 2025-12-04T09:17:11.3551123Z * [new tag] viable/strict/1759447415 -> viable/strict/1759447415 2025-12-04T09:17:11.3552676Z * [new tag] viable/strict/1759451750 -> viable/strict/1759451750 2025-12-04T09:17:11.3554135Z * [new tag] viable/strict/1759453910 -> viable/strict/1759453910 2025-12-04T09:17:11.3555667Z * [new tag] viable/strict/1759456483 -> viable/strict/1759456483 2025-12-04T09:17:11.3557193Z * [new tag] viable/strict/1759459279 -> viable/strict/1759459279 2025-12-04T09:17:11.3558725Z * [new tag] viable/strict/1759460742 -> viable/strict/1759460742 2025-12-04T09:17:11.3560205Z * [new tag] viable/strict/1759462025 -> viable/strict/1759462025 2025-12-04T09:17:11.3561774Z * [new tag] viable/strict/1759469086 -> viable/strict/1759469086 2025-12-04T09:17:11.3563239Z * [new tag] viable/strict/1759470581 -> viable/strict/1759470581 2025-12-04T09:17:11.3564667Z * [new tag] viable/strict/1759472786 -> viable/strict/1759472786 2025-12-04T09:17:11.3566187Z * [new tag] viable/strict/1759476294 -> viable/strict/1759476294 2025-12-04T09:17:11.3567889Z * [new tag] viable/strict/1759479963 -> viable/strict/1759479963 2025-12-04T09:17:11.3569209Z * [new tag] viable/strict/1759492177 -> viable/strict/1759492177 2025-12-04T09:17:11.3570690Z * [new tag] viable/strict/1759519278 -> viable/strict/1759519278 2025-12-04T09:17:11.3572117Z * [new tag] viable/strict/1759524580 -> viable/strict/1759524580 2025-12-04T09:17:11.3573564Z * [new tag] viable/strict/1759528193 -> viable/strict/1759528193 2025-12-04T09:17:11.3575242Z * [new tag] viable/strict/1759533797 -> viable/strict/1759533797 2025-12-04T09:17:11.3576808Z * [new tag] viable/strict/1759542780 -> viable/strict/1759542780 2025-12-04T09:17:11.3578282Z * [new tag] viable/strict/1759549779 -> viable/strict/1759549779 2025-12-04T09:17:11.3579771Z * [new tag] viable/strict/1759555455 -> viable/strict/1759555455 2025-12-04T09:17:11.3581301Z * [new tag] viable/strict/1759559176 -> viable/strict/1759559176 2025-12-04T09:17:11.3582981Z * [new tag] viable/strict/1759560629 -> viable/strict/1759560629 2025-12-04T09:17:11.3584423Z * [new tag] viable/strict/1759569848 -> viable/strict/1759569848 2025-12-04T09:17:11.3586622Z * [new tag] viable/strict/1759571382 -> viable/strict/1759571382 2025-12-04T09:17:11.3588159Z * [new tag] viable/strict/1759573474 -> viable/strict/1759573474 2025-12-04T09:17:11.3589597Z * [new tag] viable/strict/1759618187 -> viable/strict/1759618187 2025-12-04T09:17:11.3591105Z * [new tag] viable/strict/1759626742 -> viable/strict/1759626742 2025-12-04T09:17:11.3592603Z * [new tag] viable/strict/1759632427 -> viable/strict/1759632427 2025-12-04T09:17:11.3594057Z * [new tag] viable/strict/1759634971 -> viable/strict/1759634971 2025-12-04T09:17:11.3595593Z * [new tag] viable/strict/1759661382 -> viable/strict/1759661382 2025-12-04T09:17:11.3597195Z * [new tag] viable/strict/1759663294 -> viable/strict/1759663294 2025-12-04T09:17:11.3598524Z * [new tag] viable/strict/1759708178 -> viable/strict/1759708178 2025-12-04T09:17:11.3600037Z * [new tag] viable/strict/1759715695 -> viable/strict/1759715695 2025-12-04T09:17:11.3601720Z * [new tag] viable/strict/1759728293 -> viable/strict/1759728293 2025-12-04T09:17:11.3603018Z * [new tag] viable/strict/1759735513 -> viable/strict/1759735513 2025-12-04T09:17:11.3604571Z * [new tag] viable/strict/1759739177 -> viable/strict/1759739177 2025-12-04T09:17:11.3606006Z * [new tag] viable/strict/1759758635 -> viable/strict/1759758635 2025-12-04T09:17:11.3607444Z * [new tag] viable/strict/1759765784 -> viable/strict/1759765784 2025-12-04T09:17:11.3609024Z * [new tag] viable/strict/1759767948 -> viable/strict/1759767948 2025-12-04T09:17:11.3610517Z * [new tag] viable/strict/1759771461 -> viable/strict/1759771461 2025-12-04T09:17:11.3611842Z * [new tag] viable/strict/1759776706 -> viable/strict/1759776706 2025-12-04T09:17:11.3613389Z * [new tag] viable/strict/1759782317 -> viable/strict/1759782317 2025-12-04T09:17:11.3614987Z * [new tag] viable/strict/1759783777 -> viable/strict/1759783777 2025-12-04T09:17:11.3616436Z * [new tag] viable/strict/1759785815 -> viable/strict/1759785815 2025-12-04T09:17:11.3618024Z * [new tag] viable/strict/1759789459 -> viable/strict/1759789459 2025-12-04T09:17:11.3619628Z * [new tag] viable/strict/1759790974 -> viable/strict/1759790974 2025-12-04T09:17:11.3620943Z * [new tag] viable/strict/1759794583 -> viable/strict/1759794583 2025-12-04T09:17:11.3622350Z * [new tag] viable/strict/1759797408 -> viable/strict/1759797408 2025-12-04T09:17:11.3623984Z * [new tag] viable/strict/1759799518 -> viable/strict/1759799518 2025-12-04T09:17:11.3625836Z * [new tag] viable/strict/1759804909 -> viable/strict/1759804909 2025-12-04T09:17:11.3627251Z * [new tag] viable/strict/1759807643 -> viable/strict/1759807643 2025-12-04T09:17:11.3628819Z * [new tag] viable/strict/1759809089 -> viable/strict/1759809089 2025-12-04T09:17:11.3630305Z * [new tag] viable/strict/1759811145 -> viable/strict/1759811145 2025-12-04T09:17:11.3631855Z * [new tag] viable/strict/1759812581 -> viable/strict/1759812581 2025-12-04T09:17:11.3633358Z * [new tag] viable/strict/1759814683 -> viable/strict/1759814683 2025-12-04T09:17:11.3634815Z * [new tag] viable/strict/1759821889 -> viable/strict/1759821889 2025-12-04T09:17:11.3636412Z * [new tag] viable/strict/1759823376 -> viable/strict/1759823376 2025-12-04T09:17:11.3637886Z * [new tag] viable/strict/1759827107 -> viable/strict/1759827107 2025-12-04T09:17:11.3639435Z * [new tag] viable/strict/1759830577 -> viable/strict/1759830577 2025-12-04T09:17:11.3641035Z * [new tag] viable/strict/1759832720 -> viable/strict/1759832720 2025-12-04T09:17:11.3642489Z * [new tag] viable/strict/1759842063 -> viable/strict/1759842063 2025-12-04T09:17:11.3644018Z * [new tag] viable/strict/1759847121 -> viable/strict/1759847121 2025-12-04T09:17:11.3645806Z * [new tag] viable/strict/1759850721 -> viable/strict/1759850721 2025-12-04T09:17:11.3647246Z * [new tag] viable/strict/1759857870 -> viable/strict/1759857870 2025-12-04T09:17:11.3648794Z * [new tag] viable/strict/1759863143 -> viable/strict/1759863143 2025-12-04T09:17:11.3650247Z * [new tag] viable/strict/1759875874 -> viable/strict/1759875874 2025-12-04T09:17:11.3651608Z * [new tag] viable/strict/1759877385 -> viable/strict/1759877385 2025-12-04T09:17:11.3653088Z * [new tag] viable/strict/1759883801 -> viable/strict/1759883801 2025-12-04T09:17:11.3654724Z * [new tag] viable/strict/1759885922 -> viable/strict/1759885922 2025-12-04T09:17:11.3656122Z * [new tag] viable/strict/1759888488 -> viable/strict/1759888488 2025-12-04T09:17:11.3657617Z * [new tag] viable/strict/1759895471 -> viable/strict/1759895471 2025-12-04T09:17:11.3659144Z * [new tag] viable/strict/1759904803 -> viable/strict/1759904803 2025-12-04T09:17:11.3660826Z * [new tag] viable/strict/1759908300 -> viable/strict/1759908300 2025-12-04T09:17:11.3662330Z * [new tag] viable/strict/1759915520 -> viable/strict/1759915520 2025-12-04T09:17:11.3663936Z * [new tag] viable/strict/1759916978 -> viable/strict/1759916978 2025-12-04T09:17:11.3665168Z * [new tag] viable/strict/1759930024 -> viable/strict/1759930024 2025-12-04T09:17:11.3666737Z * [new tag] viable/strict/1759948122 -> viable/strict/1759948122 2025-12-04T09:17:11.3668314Z * [new tag] viable/strict/1759952983 -> viable/strict/1759952983 2025-12-04T09:17:11.3669883Z * [new tag] viable/strict/1759955121 -> viable/strict/1759955121 2025-12-04T09:17:11.3671343Z * [new tag] viable/strict/1759962298 -> viable/strict/1759962298 2025-12-04T09:17:11.3672817Z * [new tag] viable/strict/1759965837 -> viable/strict/1759965837 2025-12-04T09:17:11.3674435Z * [new tag] viable/strict/1759970213 -> viable/strict/1759970213 2025-12-04T09:17:11.3675962Z * [new tag] viable/strict/1759974894 -> viable/strict/1759974894 2025-12-04T09:17:11.3677441Z * [new tag] viable/strict/1759977763 -> viable/strict/1759977763 2025-12-04T09:17:11.3679007Z * [new tag] viable/strict/1759979241 -> viable/strict/1759979241 2025-12-04T09:17:11.3680589Z * [new tag] viable/strict/1759985417 -> viable/strict/1759985417 2025-12-04T09:17:11.3682105Z * [new tag] viable/strict/1759987490 -> viable/strict/1759987490 2025-12-04T09:17:11.3684077Z * [new tag] viable/strict/1759996180 -> viable/strict/1759996180 2025-12-04T09:17:11.3685597Z * [new tag] viable/strict/1760065682 -> viable/strict/1760065682 2025-12-04T09:17:11.3687115Z * [new tag] viable/strict/1760066894 -> viable/strict/1760066894 2025-12-04T09:17:11.3688603Z * [new tag] viable/strict/1760070345 -> viable/strict/1760070345 2025-12-04T09:17:11.3690211Z * [new tag] viable/strict/1760089782 -> viable/strict/1760089782 2025-12-04T09:17:11.3691702Z * [new tag] viable/strict/1760091921 -> viable/strict/1760091921 2025-12-04T09:17:11.3693274Z * [new tag] viable/strict/1760127924 -> viable/strict/1760127924 2025-12-04T09:17:11.3694734Z * [new tag] viable/strict/1760129489 -> viable/strict/1760129489 2025-12-04T09:17:11.3696306Z * [new tag] viable/strict/1760132980 -> viable/strict/1760132980 2025-12-04T09:17:11.3697880Z * [new tag] viable/strict/1760135060 -> viable/strict/1760135060 2025-12-04T09:17:11.3699443Z * [new tag] viable/strict/1760215782 -> viable/strict/1760215782 2025-12-04T09:17:11.3700918Z * [new tag] viable/strict/1760273849 -> viable/strict/1760273849 2025-12-04T09:17:11.3702372Z * [new tag] viable/strict/1760275517 -> viable/strict/1760275517 2025-12-04T09:17:11.3704074Z * [new tag] viable/strict/1760276979 -> viable/strict/1760276979 2025-12-04T09:17:11.3705622Z * [new tag] viable/strict/1760279007 -> viable/strict/1760279007 2025-12-04T09:17:11.3706831Z * [new tag] viable/strict/1760286328 -> viable/strict/1760286328 2025-12-04T09:17:11.3708257Z * [new tag] viable/strict/1760493304 -> viable/strict/1760493304 2025-12-04T09:17:11.3709839Z * [new tag] viable/strict/1760496298 -> viable/strict/1760496298 2025-12-04T09:17:11.3711213Z * [new tag] viable/strict/1760518396 -> viable/strict/1760518396 2025-12-04T09:17:11.3712699Z * [new tag] viable/strict/1760534864 -> viable/strict/1760534864 2025-12-04T09:17:11.3714244Z * [new tag] viable/strict/1760549062 -> viable/strict/1760549062 2025-12-04T09:17:11.3715829Z * [new tag] viable/strict/1760552799 -> viable/strict/1760552799 2025-12-04T09:17:11.3717359Z * [new tag] viable/strict/1760554355 -> viable/strict/1760554355 2025-12-04T09:17:11.3718917Z * [new tag] viable/strict/1760556275 -> viable/strict/1760556275 2025-12-04T09:17:11.3720448Z * [new tag] viable/strict/1760564979 -> viable/strict/1760564979 2025-12-04T09:17:11.3722011Z * [new tag] viable/strict/1760567049 -> viable/strict/1760567049 2025-12-04T09:17:11.3723901Z * [new tag] viable/strict/1760568585 -> viable/strict/1760568585 2025-12-04T09:17:11.3725531Z * [new tag] viable/strict/1760570630 -> viable/strict/1760570630 2025-12-04T09:17:11.3727039Z * [new tag] viable/strict/1760572180 -> viable/strict/1760572180 2025-12-04T09:17:11.3728524Z * [new tag] viable/strict/1760575094 -> viable/strict/1760575094 2025-12-04T09:17:11.3730240Z * [new tag] viable/strict/1760579709 -> viable/strict/1760579709 2025-12-04T09:17:11.3732544Z * [new tag] viable/strict/1760582614 -> viable/strict/1760582614 2025-12-04T09:17:11.3733426Z * [new tag] viable/strict/1760586815 -> viable/strict/1760586815 2025-12-04T09:17:11.3735086Z * [new tag] viable/strict/1760588829 -> viable/strict/1760588829 2025-12-04T09:17:11.3736577Z * [new tag] viable/strict/1760590200 -> viable/strict/1760590200 2025-12-04T09:17:11.3738160Z * [new tag] viable/strict/1760592311 -> viable/strict/1760592311 2025-12-04T09:17:11.3739655Z * [new tag] viable/strict/1760619733 -> viable/strict/1760619733 2025-12-04T09:17:11.3740990Z * [new tag] viable/strict/1760628335 -> viable/strict/1760628335 2025-12-04T09:17:11.3742473Z * [new tag] viable/strict/1760635490 -> viable/strict/1760635490 2025-12-04T09:17:11.3744116Z * [new tag] viable/strict/1760640743 -> viable/strict/1760640743 2025-12-04T09:17:11.3745556Z * [new tag] viable/strict/1760642528 -> viable/strict/1760642528 2025-12-04T09:17:11.3747063Z * [new tag] viable/strict/1760646330 -> viable/strict/1760646330 2025-12-04T09:17:11.3748500Z * [new tag] viable/strict/1760666101 -> viable/strict/1760666101 2025-12-04T09:17:11.3750005Z * [new tag] viable/strict/1760668990 -> viable/strict/1760668990 2025-12-04T09:17:11.3751520Z * [new tag] viable/strict/1760670600 -> viable/strict/1760670600 2025-12-04T09:17:11.3752989Z * [new tag] viable/strict/1760671704 -> viable/strict/1760671704 2025-12-04T09:17:11.3754451Z * [new tag] viable/strict/1760673121 -> viable/strict/1760673121 2025-12-04T09:17:11.3755959Z * [new tag] viable/strict/1760675352 -> viable/strict/1760675352 2025-12-04T09:17:11.3757535Z * [new tag] viable/strict/1760696731 -> viable/strict/1760696731 2025-12-04T09:17:11.3760383Z * [new tag] viable/strict/1760723515 -> viable/strict/1760723515 2025-12-04T09:17:11.3761861Z * [new tag] viable/strict/1760727234 -> viable/strict/1760727234 2025-12-04T09:17:11.3763453Z * [new tag] viable/strict/1760730578 -> viable/strict/1760730578 2025-12-04T09:17:11.3764941Z * [new tag] viable/strict/1760732726 -> viable/strict/1760732726 2025-12-04T09:17:11.3766674Z * [new tag] viable/strict/1760734180 -> viable/strict/1760734180 2025-12-04T09:17:11.3768029Z * [new tag] viable/strict/1760736251 -> viable/strict/1760736251 2025-12-04T09:17:11.3769543Z * [new tag] viable/strict/1760737772 -> viable/strict/1760737772 2025-12-04T09:17:11.3771067Z * [new tag] viable/strict/1760758005 -> viable/strict/1760758005 2025-12-04T09:17:11.3772574Z * [new tag] viable/strict/1760761532 -> viable/strict/1760761532 2025-12-04T09:17:11.3774063Z * [new tag] viable/strict/1760802581 -> viable/strict/1760802581 2025-12-04T09:17:11.3775531Z * [new tag] viable/strict/1760827772 -> viable/strict/1760827772 2025-12-04T09:17:11.3777029Z * [new tag] viable/strict/1760834524 -> viable/strict/1760834524 2025-12-04T09:17:11.3778590Z * [new tag] viable/strict/1760845009 -> viable/strict/1760845009 2025-12-04T09:17:11.3780192Z * [new tag] viable/strict/1760876836 -> viable/strict/1760876836 2025-12-04T09:17:11.3781678Z * [new tag] viable/strict/1760880329 -> viable/strict/1760880329 2025-12-04T09:17:11.3783762Z * [new tag] viable/strict/1760888987 -> viable/strict/1760888987 2025-12-04T09:17:11.3785253Z * [new tag] viable/strict/1760912664 -> viable/strict/1760912664 2025-12-04T09:17:11.3786763Z * [new tag] viable/strict/1760925321 -> viable/strict/1760925321 2025-12-04T09:17:11.3788300Z * [new tag] viable/strict/1760931488 -> viable/strict/1760931488 2025-12-04T09:17:11.3789812Z * [new tag] viable/strict/1760932693 -> viable/strict/1760932693 2025-12-04T09:17:11.3791350Z * [new tag] viable/strict/1761004184 -> viable/strict/1761004184 2025-12-04T09:17:11.3792830Z * [new tag] viable/strict/1761014748 -> viable/strict/1761014748 2025-12-04T09:17:11.3794288Z * [new tag] viable/strict/1761017491 -> viable/strict/1761017491 2025-12-04T09:17:11.3795849Z * [new tag] viable/strict/1761018806 -> viable/strict/1761018806 2025-12-04T09:17:11.3797428Z * [new tag] viable/strict/1761020754 -> viable/strict/1761020754 2025-12-04T09:17:11.3798959Z * [new tag] viable/strict/1761024303 -> viable/strict/1761024303 2025-12-04T09:17:11.3800417Z * [new tag] viable/strict/1761029582 -> viable/strict/1761029582 2025-12-04T09:17:11.3801925Z * [new tag] viable/strict/1761031535 -> viable/strict/1761031535 2025-12-04T09:17:11.3803376Z * [new tag] viable/strict/1761035196 -> viable/strict/1761035196 2025-12-04T09:17:11.3804967Z * [new tag] viable/strict/1761045825 -> viable/strict/1761045825 2025-12-04T09:17:11.3806561Z * [new tag] viable/strict/1761054796 -> viable/strict/1761054796 2025-12-04T09:17:11.3808032Z * [new tag] viable/strict/1761060314 -> viable/strict/1761060314 2025-12-04T09:17:11.3809570Z * [new tag] viable/strict/1761071198 -> viable/strict/1761071198 2025-12-04T09:17:11.3811100Z * [new tag] viable/strict/1761074628 -> viable/strict/1761074628 2025-12-04T09:17:11.3812614Z * [new tag] viable/strict/1761078351 -> viable/strict/1761078351 2025-12-04T09:17:11.3814110Z * [new tag] viable/strict/1761079822 -> viable/strict/1761079822 2025-12-04T09:17:11.3815929Z * [new tag] viable/strict/1761081873 -> viable/strict/1761081873 2025-12-04T09:17:11.3817441Z * [new tag] viable/strict/1761083392 -> viable/strict/1761083392 2025-12-04T09:17:11.3819001Z * [new tag] viable/strict/1761085465 -> viable/strict/1761085465 2025-12-04T09:17:11.3820548Z * [new tag] viable/strict/1761089099 -> viable/strict/1761089099 2025-12-04T09:17:11.3822206Z * [new tag] viable/strict/1761095535 -> viable/strict/1761095535 2025-12-04T09:17:11.3823468Z * [new tag] viable/strict/1761098119 -> viable/strict/1761098119 2025-12-04T09:17:11.3825935Z * [new tag] viable/strict/1761101330 -> viable/strict/1761101330 2025-12-04T09:17:11.3827305Z * [new tag] viable/strict/1761114425 -> viable/strict/1761114425 2025-12-04T09:17:11.3828899Z * [new tag] viable/strict/1761116036 -> viable/strict/1761116036 2025-12-04T09:17:11.3830450Z * [new tag] viable/strict/1761119379 -> viable/strict/1761119379 2025-12-04T09:17:11.3831988Z * [new tag] viable/strict/1761121601 -> viable/strict/1761121601 2025-12-04T09:17:11.3833425Z * [new tag] viable/strict/1761123234 -> viable/strict/1761123234 2025-12-04T09:17:11.3834902Z * [new tag] viable/strict/1761126621 -> viable/strict/1761126621 2025-12-04T09:17:11.3836380Z * [new tag] viable/strict/1761132259 -> viable/strict/1761132259 2025-12-04T09:17:11.3837955Z * [new tag] viable/strict/1761146746 -> viable/strict/1761146746 2025-12-04T09:17:11.3839575Z * [new tag] viable/strict/1761164752 -> viable/strict/1761164752 2025-12-04T09:17:11.3841045Z * [new tag] viable/strict/1761166198 -> viable/strict/1761166198 2025-12-04T09:17:11.3842586Z * [new tag] viable/strict/1761175424 -> viable/strict/1761175424 2025-12-04T09:17:11.3844084Z * [new tag] viable/strict/1761176983 -> viable/strict/1761176983 2025-12-04T09:17:11.3845770Z * [new tag] viable/strict/1761179891 -> viable/strict/1761179891 2025-12-04T09:17:11.3847345Z * [new tag] viable/strict/1761181930 -> viable/strict/1761181930 2025-12-04T09:17:11.3848843Z * [new tag] viable/strict/1761184516 -> viable/strict/1761184516 2025-12-04T09:17:11.3850396Z * [new tag] viable/strict/1761190179 -> viable/strict/1761190179 2025-12-04T09:17:11.3851932Z * [new tag] viable/strict/1761193558 -> viable/strict/1761193558 2025-12-04T09:17:11.3853376Z * [new tag] viable/strict/1761207990 -> viable/strict/1761207990 2025-12-04T09:17:11.3854864Z * [new tag] viable/strict/1761229539 -> viable/strict/1761229539 2025-12-04T09:17:11.3856583Z * [new tag] viable/strict/1761244031 -> viable/strict/1761244031 2025-12-04T09:17:11.3858150Z * [new tag] viable/strict/1761248986 -> viable/strict/1761248986 2025-12-04T09:17:11.3859623Z * [new tag] viable/strict/1761259791 -> viable/strict/1761259791 2025-12-04T09:17:11.3861087Z * [new tag] viable/strict/1761266139 -> viable/strict/1761266139 2025-12-04T09:17:11.3862674Z * [new tag] viable/strict/1761268316 -> viable/strict/1761268316 2025-12-04T09:17:11.3864336Z * [new tag] viable/strict/1761273805 -> viable/strict/1761273805 2025-12-04T09:17:11.3865746Z * [new tag] viable/strict/1761275261 -> viable/strict/1761275261 2025-12-04T09:17:11.3867287Z * [new tag] viable/strict/1761277913 -> viable/strict/1761277913 2025-12-04T09:17:11.3868854Z * [new tag] viable/strict/1761290701 -> viable/strict/1761290701 2025-12-04T09:17:11.3870438Z * [new tag] viable/strict/1761294396 -> viable/strict/1761294396 2025-12-04T09:17:11.3871932Z * [new tag] viable/strict/1761303047 -> viable/strict/1761303047 2025-12-04T09:17:11.3873432Z * [new tag] viable/strict/1761335388 -> viable/strict/1761335388 2025-12-04T09:17:11.3874892Z * [new tag] viable/strict/1761337551 -> viable/strict/1761337551 2025-12-04T09:17:11.3876589Z * [new tag] viable/strict/1761339007 -> viable/strict/1761339007 2025-12-04T09:17:11.3877931Z * [new tag] viable/strict/1761341050 -> viable/strict/1761341050 2025-12-04T09:17:11.3879633Z * [new tag] viable/strict/1761346188 -> viable/strict/1761346188 2025-12-04T09:17:11.3881240Z * [new tag] viable/strict/1761349792 -> viable/strict/1761349792 2025-12-04T09:17:11.3883188Z * [new tag] viable/strict/1761352620 -> viable/strict/1761352620 2025-12-04T09:17:11.3884682Z * [new tag] viable/strict/1761354730 -> viable/strict/1761354730 2025-12-04T09:17:11.3886250Z * [new tag] viable/strict/1761357298 -> viable/strict/1761357298 2025-12-04T09:17:11.3887819Z * [new tag] viable/strict/1761360201 -> viable/strict/1761360201 2025-12-04T09:17:11.3889260Z * [new tag] viable/strict/1761361753 -> viable/strict/1761361753 2025-12-04T09:17:11.3890778Z * [new tag] viable/strict/1761364351 -> viable/strict/1761364351 2025-12-04T09:17:11.3892272Z * [new tag] viable/strict/1761366338 -> viable/strict/1761366338 2025-12-04T09:17:11.3893900Z * [new tag] viable/strict/1761367802 -> viable/strict/1761367802 2025-12-04T09:17:11.3895465Z * [new tag] viable/strict/1761369889 -> viable/strict/1761369889 2025-12-04T09:17:11.3897049Z * [new tag] viable/strict/1761371385 -> viable/strict/1761371385 2025-12-04T09:17:11.3898619Z * [new tag] viable/strict/1761373581 -> viable/strict/1761373581 2025-12-04T09:17:11.3900249Z * [new tag] viable/strict/1761375054 -> viable/strict/1761375054 2025-12-04T09:17:11.3901879Z * [new tag] viable/strict/1761421785 -> viable/strict/1761421785 2025-12-04T09:17:11.3903548Z * [new tag] viable/strict/1761434614 -> viable/strict/1761434614 2025-12-04T09:17:11.3905507Z * [new tag] viable/strict/1761439254 -> viable/strict/1761439254 2025-12-04T09:17:11.3907017Z * [new tag] viable/strict/1761454187 -> viable/strict/1761454187 2025-12-04T09:17:11.3908669Z * [new tag] viable/strict/1761459991 -> viable/strict/1761459991 2025-12-04T09:17:11.3910400Z * [new tag] viable/strict/1761470668 -> viable/strict/1761470668 2025-12-04T09:17:11.3912329Z * [new tag] viable/strict/1761472188 -> viable/strict/1761472188 2025-12-04T09:17:11.3913966Z * [new tag] viable/strict/1761503178 -> viable/strict/1761503178 2025-12-04T09:17:11.3915500Z * [new tag] viable/strict/1761517492 -> viable/strict/1761517492 2025-12-04T09:17:11.3917015Z * [new tag] viable/strict/1761518981 -> viable/strict/1761518981 2025-12-04T09:17:11.3918543Z * [new tag] viable/strict/1761533609 -> viable/strict/1761533609 2025-12-04T09:17:11.3919962Z * [new tag] viable/strict/1761546438 -> viable/strict/1761546438 2025-12-04T09:17:11.3921584Z * [new tag] viable/strict/1761548133 -> viable/strict/1761548133 2025-12-04T09:17:11.3923345Z * [new tag] viable/strict/1761555186 -> viable/strict/1761555186 2025-12-04T09:17:11.3924972Z * [new tag] viable/strict/1761557178 -> viable/strict/1761557178 2025-12-04T09:17:11.3929771Z * [new tag] viable/strict/1761560772 -> viable/strict/1761560772 2025-12-04T09:17:11.3931473Z * [new tag] viable/strict/1761562266 -> viable/strict/1761562266 2025-12-04T09:17:11.3932928Z * [new tag] viable/strict/1761564260 -> viable/strict/1761564260 2025-12-04T09:17:11.3934496Z * [new tag] viable/strict/1761568072 -> viable/strict/1761568072 2025-12-04T09:17:11.3936073Z * [new tag] viable/strict/1761571683 -> viable/strict/1761571683 2025-12-04T09:17:11.3937518Z * [new tag] viable/strict/1761580199 -> viable/strict/1761580199 2025-12-04T09:17:11.3939065Z * [new tag] viable/strict/1761587383 -> viable/strict/1761587383 2025-12-04T09:17:11.3940533Z * [new tag] viable/strict/1761591165 -> viable/strict/1761591165 2025-12-04T09:17:11.3942091Z * [new tag] viable/strict/1761594575 -> viable/strict/1761594575 2025-12-04T09:17:11.3943742Z * [new tag] viable/strict/1761596710 -> viable/strict/1761596710 2025-12-04T09:17:11.3945261Z * [new tag] viable/strict/1761598189 -> viable/strict/1761598189 2025-12-04T09:17:11.3946860Z * [new tag] viable/strict/1761600254 -> viable/strict/1761600254 2025-12-04T09:17:11.3948400Z * [new tag] viable/strict/1761603879 -> viable/strict/1761603879 2025-12-04T09:17:11.3949865Z * [new tag] viable/strict/1761605429 -> viable/strict/1761605429 2025-12-04T09:17:11.3951565Z * [new tag] viable/strict/1761607468 -> viable/strict/1761607468 2025-12-04T09:17:11.3953105Z * [new tag] viable/strict/1761608983 -> viable/strict/1761608983 2025-12-04T09:17:11.3954654Z * [new tag] viable/strict/1761611846 -> viable/strict/1761611846 2025-12-04T09:17:11.3956187Z * [new tag] viable/strict/1761613922 -> viable/strict/1761613922 2025-12-04T09:17:11.3957647Z * [new tag] viable/strict/1761616504 -> viable/strict/1761616504 2025-12-04T09:17:11.3959003Z * [new tag] viable/strict/1761619599 -> viable/strict/1761619599 2025-12-04T09:17:11.3966621Z * [new tag] viable/strict/1761686693 -> viable/strict/1761686693 2025-12-04T09:17:11.3967015Z * [new tag] viable/strict/1761688179 -> viable/strict/1761688179 2025-12-04T09:17:11.3967206Z * [new tag] viable/strict/1761691973 -> viable/strict/1761691973 2025-12-04T09:17:11.3967405Z * [new tag] viable/strict/1761693884 -> viable/strict/1761693884 2025-12-04T09:17:11.3967585Z * [new tag] viable/strict/1761695389 -> viable/strict/1761695389 2025-12-04T09:17:11.3968049Z * [new tag] viable/strict/1761698408 -> viable/strict/1761698408 2025-12-04T09:17:11.3969849Z * [new tag] viable/strict/1761702931 -> viable/strict/1761702931 2025-12-04T09:17:11.3971363Z * [new tag] viable/strict/1761706307 -> viable/strict/1761706307 2025-12-04T09:17:11.3972989Z * [new tag] viable/strict/1761709065 -> viable/strict/1761709065 2025-12-04T09:17:11.3974584Z * [new tag] viable/strict/1761710285 -> viable/strict/1761710285 2025-12-04T09:17:11.3976179Z * [new tag] viable/strict/1761711983 -> viable/strict/1761711983 2025-12-04T09:17:11.3977830Z * [new tag] viable/strict/1761713514 -> viable/strict/1761713514 2025-12-04T09:17:11.3979558Z * [new tag] viable/strict/1761715523 -> viable/strict/1761715523 2025-12-04T09:17:11.3981144Z * [new tag] viable/strict/1761727973 -> viable/strict/1761727973 2025-12-04T09:17:11.3982824Z * [new tag] viable/strict/1761751558 -> viable/strict/1761751558 2025-12-04T09:17:11.3984705Z * [new tag] viable/strict/1761755187 -> viable/strict/1761755187 2025-12-04T09:17:11.3986247Z * [new tag] viable/strict/1761756826 -> viable/strict/1761756826 2025-12-04T09:17:11.3988394Z * [new tag] viable/strict/1761769551 -> viable/strict/1761769551 2025-12-04T09:17:11.3990078Z * [new tag] viable/strict/1761771032 -> viable/strict/1761771032 2025-12-04T09:17:11.3991625Z * [new tag] viable/strict/1761773101 -> viable/strict/1761773101 2025-12-04T09:17:11.3993202Z * [new tag] viable/strict/1761781792 -> viable/strict/1761781792 2025-12-04T09:17:11.3994898Z * [new tag] viable/strict/1761784788 -> viable/strict/1761784788 2025-12-04T09:17:11.3996441Z * [new tag] viable/strict/1761786740 -> viable/strict/1761786740 2025-12-04T09:17:11.3998125Z * [new tag] viable/strict/1761789332 -> viable/strict/1761789332 2025-12-04T09:17:11.4000205Z * [new tag] viable/strict/1761792569 -> viable/strict/1761792569 2025-12-04T09:17:11.4001808Z * [new tag] viable/strict/1761795289 -> viable/strict/1761795289 2025-12-04T09:17:11.4003342Z * [new tag] viable/strict/1761798345 -> viable/strict/1761798345 2025-12-04T09:17:11.4004936Z * [new tag] viable/strict/1761799827 -> viable/strict/1761799827 2025-12-04T09:17:11.4006596Z * [new tag] viable/strict/1761805604 -> viable/strict/1761805604 2025-12-04T09:17:11.4008182Z * [new tag] viable/strict/1761807202 -> viable/strict/1761807202 2025-12-04T09:17:11.4009809Z * [new tag] viable/strict/1761809094 -> viable/strict/1761809094 2025-12-04T09:17:11.4011347Z * [new tag] viable/strict/1761810576 -> viable/strict/1761810576 2025-12-04T09:17:11.4013002Z * [new tag] viable/strict/1761812771 -> viable/strict/1761812771 2025-12-04T09:17:11.4014582Z * [new tag] viable/strict/1761814363 -> viable/strict/1761814363 2025-12-04T09:17:11.4016197Z * [new tag] viable/strict/1761857410 -> viable/strict/1761857410 2025-12-04T09:17:11.4017868Z * [new tag] viable/strict/1761860985 -> viable/strict/1761860985 2025-12-04T09:17:11.4019444Z * [new tag] viable/strict/1761863094 -> viable/strict/1761863094 2025-12-04T09:17:11.4021003Z * [new tag] viable/strict/1761864590 -> viable/strict/1761864590 2025-12-04T09:17:11.4022658Z * [new tag] viable/strict/1761866675 -> viable/strict/1761866675 2025-12-04T09:17:11.4024670Z * [new tag] viable/strict/1761868178 -> viable/strict/1761868178 2025-12-04T09:17:11.4026442Z * [new tag] viable/strict/1761871111 -> viable/strict/1761871111 2025-12-04T09:17:11.4028085Z * [new tag] viable/strict/1761873126 -> viable/strict/1761873126 2025-12-04T09:17:11.4029721Z * [new tag] viable/strict/1761875714 -> viable/strict/1761875714 2025-12-04T09:17:11.4031274Z * [new tag] viable/strict/1761878924 -> viable/strict/1761878924 2025-12-04T09:17:11.4032949Z * [new tag] viable/strict/1761881727 -> viable/strict/1761881727 2025-12-04T09:17:11.4034479Z * [new tag] viable/strict/1761882959 -> viable/strict/1761882959 2025-12-04T09:17:11.4036042Z * [new tag] viable/strict/1761886268 -> viable/strict/1761886268 2025-12-04T09:17:11.4037739Z * [new tag] viable/strict/1761893641 -> viable/strict/1761893641 2025-12-04T09:17:11.4039376Z * [new tag] viable/strict/1761931517 -> viable/strict/1761931517 2025-12-04T09:17:11.4040980Z * [new tag] viable/strict/1761933080 -> viable/strict/1761933080 2025-12-04T09:17:11.4042588Z * [new tag] viable/strict/1761935217 -> viable/strict/1761935217 2025-12-04T09:17:11.4044199Z * [new tag] viable/strict/1761938533 -> viable/strict/1761938533 2025-12-04T09:17:11.4045867Z * [new tag] viable/strict/1761940184 -> viable/strict/1761940184 2025-12-04T09:17:11.4047546Z * [new tag] viable/strict/1761942338 -> viable/strict/1761942338 2025-12-04T09:17:11.4049141Z * [new tag] viable/strict/1761946100 -> viable/strict/1761946100 2025-12-04T09:17:11.4050721Z * [new tag] viable/strict/1761947374 -> viable/strict/1761947374 2025-12-04T09:17:11.4052308Z * [new tag] viable/strict/1761950978 -> viable/strict/1761950978 2025-12-04T09:17:11.4054109Z * [new tag] viable/strict/1761957727 -> viable/strict/1761957727 2025-12-04T09:17:11.4055539Z * [new tag] viable/strict/1761959532 -> viable/strict/1761959532 2025-12-04T09:17:11.4057149Z * [new tag] viable/strict/1761965366 -> viable/strict/1761965366 2025-12-04T09:17:11.4058922Z * [new tag] viable/strict/1761968066 -> viable/strict/1761968066 2025-12-04T09:17:11.4060518Z * [new tag] viable/strict/1761969322 -> viable/strict/1761969322 2025-12-04T09:17:11.4062068Z * [new tag] viable/strict/1761974723 -> viable/strict/1761974723 2025-12-04T09:17:11.4063818Z * [new tag] viable/strict/1761981837 -> viable/strict/1761981837 2025-12-04T09:17:11.4065466Z * [new tag] viable/strict/1761985546 -> viable/strict/1761985546 2025-12-04T09:17:11.4067076Z * [new tag] viable/strict/1761987030 -> viable/strict/1761987030 2025-12-04T09:17:11.4068741Z * [new tag] viable/strict/1762003554 -> viable/strict/1762003554 2025-12-04T09:17:11.4070337Z * [new tag] viable/strict/1762021560 -> viable/strict/1762021560 2025-12-04T09:17:11.4071875Z * [new tag] viable/strict/1762032190 -> viable/strict/1762032190 2025-12-04T09:17:11.4073578Z * [new tag] viable/strict/1762040981 -> viable/strict/1762040981 2025-12-04T09:17:11.4075097Z * [new tag] viable/strict/1762048525 -> viable/strict/1762048525 2025-12-04T09:17:11.4076793Z * [new tag] viable/strict/1762104223 -> viable/strict/1762104223 2025-12-04T09:17:11.4078489Z * [new tag] viable/strict/1762105778 -> viable/strict/1762105778 2025-12-04T09:17:11.4080070Z * [new tag] viable/strict/1762115109 -> viable/strict/1762115109 2025-12-04T09:17:11.4081673Z * [new tag] viable/strict/1762125840 -> viable/strict/1762125840 2025-12-04T09:17:11.4083178Z * [new tag] viable/strict/1762127377 -> viable/strict/1762127377 2025-12-04T09:17:11.4085083Z * [new tag] viable/strict/1762134925 -> viable/strict/1762134925 2025-12-04T09:17:11.4086643Z * [new tag] viable/strict/1762138338 -> viable/strict/1762138338 2025-12-04T09:17:11.4088308Z * [new tag] viable/strict/1762148993 -> viable/strict/1762148993 2025-12-04T09:17:11.4089994Z * [new tag] viable/strict/1762152871 -> viable/strict/1762152871 2025-12-04T09:17:11.4091550Z * [new tag] viable/strict/1762156183 -> viable/strict/1762156183 2025-12-04T09:17:11.4093791Z * [new tag] viable/strict/1762163457 -> viable/strict/1762163457 2025-12-04T09:17:11.4095328Z * [new tag] viable/strict/1762165569 -> viable/strict/1762165569 2025-12-04T09:17:11.4096939Z * [new tag] viable/strict/1762169035 -> viable/strict/1762169035 2025-12-04T09:17:11.4098626Z * [new tag] viable/strict/1762174936 -> viable/strict/1762174936 2025-12-04T09:17:11.4100130Z * [new tag] viable/strict/1762194412 -> viable/strict/1762194412 2025-12-04T09:17:11.4101789Z * [new tag] viable/strict/1762195876 -> viable/strict/1762195876 2025-12-04T09:17:11.4103430Z * [new tag] viable/strict/1762197788 -> viable/strict/1762197788 2025-12-04T09:17:11.4105136Z * [new tag] viable/strict/1762199389 -> viable/strict/1762199389 2025-12-04T09:17:11.4106898Z * [new tag] viable/strict/1762206585 -> viable/strict/1762206585 2025-12-04T09:17:11.4108614Z * [new tag] viable/strict/1762210184 -> viable/strict/1762210184 2025-12-04T09:17:11.4110133Z * [new tag] viable/strict/1762218736 -> viable/strict/1762218736 2025-12-04T09:17:11.4111760Z * [new tag] viable/strict/1762224529 -> viable/strict/1762224529 2025-12-04T09:17:11.4113423Z * [new tag] viable/strict/1762227253 -> viable/strict/1762227253 2025-12-04T09:17:11.4114787Z * [new tag] viable/strict/1762228515 -> viable/strict/1762228515 2025-12-04T09:17:11.4116468Z * [new tag] viable/strict/1762230349 -> viable/strict/1762230349 2025-12-04T09:17:11.4118080Z * [new tag] viable/strict/1762231859 -> viable/strict/1762231859 2025-12-04T09:17:11.4119681Z * [new tag] viable/strict/1762233925 -> viable/strict/1762233925 2025-12-04T09:17:11.4121424Z * [new tag] viable/strict/1762237630 -> viable/strict/1762237630 2025-12-04T09:17:11.4122852Z * [new tag] viable/strict/1762253522 -> viable/strict/1762253522 2025-12-04T09:17:11.4124734Z * [new tag] viable/strict/1762278588 -> viable/strict/1762278588 2025-12-04T09:17:11.4126445Z * [new tag] viable/strict/1762284203 -> viable/strict/1762284203 2025-12-04T09:17:11.4128126Z * [new tag] viable/strict/1762289446 -> viable/strict/1762289446 2025-12-04T09:17:11.4129698Z * [new tag] viable/strict/1762291515 -> viable/strict/1762291515 2025-12-04T09:17:11.4131294Z * [new tag] viable/strict/1762295100 -> viable/strict/1762295100 2025-12-04T09:17:11.4132765Z * [new tag] viable/strict/1762296590 -> viable/strict/1762296590 2025-12-04T09:17:11.4134532Z * [new tag] viable/strict/1762300179 -> viable/strict/1762300179 2025-12-04T09:17:11.4135540Z * [new tag] viable/strict/1762303207 -> viable/strict/1762303207 2025-12-04T09:17:11.4137410Z * [new tag] viable/strict/1762386584 -> viable/strict/1762386584 2025-12-04T09:17:11.4138897Z * [new tag] viable/strict/1762391537 -> viable/strict/1762391537 2025-12-04T09:17:11.4140368Z * [new tag] viable/strict/1762394119 -> viable/strict/1762394119 2025-12-04T09:17:11.4142325Z * [new tag] viable/strict/1762397437 -> viable/strict/1762397437 2025-12-04T09:17:11.4144114Z * [new tag] viable/strict/1762400256 -> viable/strict/1762400256 2025-12-04T09:17:11.4145684Z * [new tag] viable/strict/1762401469 -> viable/strict/1762401469 2025-12-04T09:17:11.4147562Z * [new tag] viable/strict/1762408195 -> viable/strict/1762408195 2025-12-04T09:17:11.4149103Z * [new tag] viable/strict/1762410411 -> viable/strict/1762410411 2025-12-04T09:17:11.4150703Z * [new tag] viable/strict/1762417613 -> viable/strict/1762417613 2025-12-04T09:17:11.4152316Z * [new tag] viable/strict/1762419198 -> viable/strict/1762419198 2025-12-04T09:17:11.4154048Z * [new tag] viable/strict/1762422656 -> viable/strict/1762422656 2025-12-04T09:17:11.4155922Z * [new tag] viable/strict/1762424746 -> viable/strict/1762424746 2025-12-04T09:17:11.4157678Z * [new tag] viable/strict/1762446386 -> viable/strict/1762446386 2025-12-04T09:17:11.4159352Z * [new tag] viable/strict/1762449912 -> viable/strict/1762449912 2025-12-04T09:17:11.4160923Z * [new tag] viable/strict/1762457031 -> viable/strict/1762457031 2025-12-04T09:17:11.4162525Z * [new tag] viable/strict/1762462441 -> viable/strict/1762462441 2025-12-04T09:17:11.4164130Z * [new tag] viable/strict/1762467909 -> viable/strict/1762467909 2025-12-04T09:17:11.4165826Z * [new tag] viable/strict/1762471493 -> viable/strict/1762471493 2025-12-04T09:17:11.4167509Z * [new tag] viable/strict/1762475990 -> viable/strict/1762475990 2025-12-04T09:17:11.4169084Z * [new tag] viable/strict/1762477933 -> viable/strict/1762477933 2025-12-04T09:17:11.4170716Z * [new tag] viable/strict/1762491053 -> viable/strict/1762491053 2025-12-04T09:17:11.4172494Z * [new tag] viable/strict/1762493118 -> viable/strict/1762493118 2025-12-04T09:17:11.4174000Z * [new tag] viable/strict/1762498442 -> viable/strict/1762498442 2025-12-04T09:17:11.4175525Z * [new tag] viable/strict/1762501778 -> viable/strict/1762501778 2025-12-04T09:17:11.4177248Z * [new tag] viable/strict/1762504001 -> viable/strict/1762504001 2025-12-04T09:17:11.4178947Z * [new tag] viable/strict/1762505583 -> viable/strict/1762505583 2025-12-04T09:17:11.4180568Z * [new tag] viable/strict/1762507523 -> viable/strict/1762507523 2025-12-04T09:17:11.4182279Z * [new tag] viable/strict/1762511140 -> viable/strict/1762511140 2025-12-04T09:17:11.4184115Z * [new tag] viable/strict/1762512632 -> viable/strict/1762512632 2025-12-04T09:17:11.4185744Z * [new tag] viable/strict/1762520467 -> viable/strict/1762520467 2025-12-04T09:17:11.4187420Z * [new tag] viable/strict/1762522016 -> viable/strict/1762522016 2025-12-04T09:17:11.4188907Z * [new tag] viable/strict/1762530591 -> viable/strict/1762530591 2025-12-04T09:17:11.4190619Z * [new tag] viable/strict/1762543405 -> viable/strict/1762543405 2025-12-04T09:17:11.4192089Z * [new tag] viable/strict/1762544998 -> viable/strict/1762544998 2025-12-04T09:17:11.4193561Z * [new tag] viable/strict/1762552182 -> viable/strict/1762552182 2025-12-04T09:17:11.4195214Z * [new tag] viable/strict/1762554297 -> viable/strict/1762554297 2025-12-04T09:17:11.4196788Z * [new tag] viable/strict/1762559381 -> viable/strict/1762559381 2025-12-04T09:17:11.4198877Z * [new tag] viable/strict/1762562222 -> viable/strict/1762562222 2025-12-04T09:17:11.4200422Z * [new tag] viable/strict/1762564319 -> viable/strict/1762564319 2025-12-04T09:17:11.4201851Z * [new tag] viable/strict/1762566904 -> viable/strict/1762566904 2025-12-04T09:17:11.4203447Z * [new tag] viable/strict/1762569781 -> viable/strict/1762569781 2025-12-04T09:17:11.4205036Z * [new tag] viable/strict/1762575940 -> viable/strict/1762575940 2025-12-04T09:17:11.4206817Z * [new tag] viable/strict/1762580974 -> viable/strict/1762580974 2025-12-04T09:17:11.4209573Z * [new tag] viable/strict/1762583185 -> viable/strict/1762583185 2025-12-04T09:17:11.4210173Z * [new tag] viable/strict/1762586647 -> viable/strict/1762586647 2025-12-04T09:17:11.4211731Z * [new tag] viable/strict/1762588183 -> viable/strict/1762588183 2025-12-04T09:17:11.4213183Z * [new tag] viable/strict/1762593886 -> viable/strict/1762593886 2025-12-04T09:17:11.4214795Z * [new tag] viable/strict/1762650743 -> viable/strict/1762650743 2025-12-04T09:17:11.4216538Z * [new tag] viable/strict/1762653328 -> viable/strict/1762653328 2025-12-04T09:17:11.4218161Z * [new tag] viable/strict/1762659342 -> viable/strict/1762659342 2025-12-04T09:17:11.4219837Z * [new tag] viable/strict/1762662360 -> viable/strict/1762662360 2025-12-04T09:17:11.4221496Z * [new tag] viable/strict/1762667377 -> viable/strict/1762667377 2025-12-04T09:17:11.4223143Z * [new tag] viable/strict/1762671090 -> viable/strict/1762671090 2025-12-04T09:17:11.4225061Z * [new tag] viable/strict/1762680284 -> viable/strict/1762680284 2025-12-04T09:17:11.4226802Z * [new tag] viable/strict/1762683900 -> viable/strict/1762683900 2025-12-04T09:17:11.4228400Z * [new tag] viable/strict/1762705541 -> viable/strict/1762705541 2025-12-04T09:17:11.4229962Z * [new tag] viable/strict/1762709004 -> viable/strict/1762709004 2025-12-04T09:17:11.4231867Z * [new tag] viable/strict/1762746004 -> viable/strict/1762746004 2025-12-04T09:17:11.4233486Z * [new tag] viable/strict/1762748799 -> viable/strict/1762748799 2025-12-04T09:17:11.4235101Z * [new tag] viable/strict/1762759504 -> viable/strict/1762759504 2025-12-04T09:17:11.4236886Z * [new tag] viable/strict/1762760973 -> viable/strict/1762760973 2025-12-04T09:17:11.4238512Z * [new tag] viable/strict/1762775374 -> viable/strict/1762775374 2025-12-04T09:17:11.4240182Z * [new tag] viable/strict/1762777661 -> viable/strict/1762777661 2025-12-04T09:17:11.4241843Z * [new tag] viable/strict/1762779774 -> viable/strict/1762779774 2025-12-04T09:17:11.4243597Z * [new tag] viable/strict/1762781259 -> viable/strict/1762781259 2025-12-04T09:17:11.4245317Z * [new tag] viable/strict/1762793628 -> viable/strict/1762793628 2025-12-04T09:17:11.4247039Z * [new tag] viable/strict/1762800711 -> viable/strict/1762800711 2025-12-04T09:17:11.4248816Z * [new tag] viable/strict/1762809894 -> viable/strict/1762809894 2025-12-04T09:17:11.4250378Z * [new tag] viable/strict/1762811384 -> viable/strict/1762811384 2025-12-04T09:17:11.4252115Z * [new tag] viable/strict/1762813841 -> viable/strict/1762813841 2025-12-04T09:17:11.4253787Z * [new tag] viable/strict/1762815047 -> viable/strict/1762815047 2025-12-04T09:17:11.4255505Z * [new tag] viable/strict/1762817094 -> viable/strict/1762817094 2025-12-04T09:17:11.4257211Z * [new tag] viable/strict/1762818582 -> viable/strict/1762818582 2025-12-04T09:17:11.4258813Z * [new tag] viable/strict/1762821623 -> viable/strict/1762821623 2025-12-04T09:17:11.4260308Z * [new tag] viable/strict/1762823531 -> viable/strict/1762823531 2025-12-04T09:17:11.4262153Z * [new tag] viable/strict/1762849583 -> viable/strict/1762849583 2025-12-04T09:17:11.4263909Z * [new tag] viable/strict/1762851200 -> viable/strict/1762851200 2025-12-04T09:17:11.4265519Z * [new tag] viable/strict/1762854603 -> viable/strict/1762854603 2025-12-04T09:17:11.4267262Z * [new tag] viable/strict/1762858276 -> viable/strict/1762858276 2025-12-04T09:17:11.4268838Z * [new tag] viable/strict/1762860891 -> viable/strict/1762860891 2025-12-04T09:17:11.4271135Z * [new tag] viable/strict/1762866174 -> viable/strict/1762866174 2025-12-04T09:17:11.4272775Z * [new tag] viable/strict/1762867653 -> viable/strict/1762867653 2025-12-04T09:17:11.4274353Z * [new tag] viable/strict/1762872669 -> viable/strict/1762872669 2025-12-04T09:17:11.4275787Z * [new tag] viable/strict/1762878380 -> viable/strict/1762878380 2025-12-04T09:17:11.4277482Z * [new tag] viable/strict/1762889003 -> viable/strict/1762889003 2025-12-04T09:17:11.4279136Z * [new tag] viable/strict/1762890589 -> viable/strict/1762890589 2025-12-04T09:17:11.4280866Z * [new tag] viable/strict/1762892743 -> viable/strict/1762892743 2025-12-04T09:17:11.4282598Z * [new tag] viable/strict/1762894271 -> viable/strict/1762894271 2025-12-04T09:17:11.4284096Z * [new tag] viable/strict/1762896287 -> viable/strict/1762896287 2025-12-04T09:17:11.4285669Z * [new tag] viable/strict/1762915871 -> viable/strict/1762915871 2025-12-04T09:17:11.4287331Z * [new tag] viable/strict/1762918569 -> viable/strict/1762918569 2025-12-04T09:17:11.4288804Z * [new tag] viable/strict/1762919776 -> viable/strict/1762919776 2025-12-04T09:17:11.4290545Z * [new tag] viable/strict/1762923072 -> viable/strict/1762923072 2025-12-04T09:17:11.4292196Z * [new tag] viable/strict/1762928826 -> viable/strict/1762928826 2025-12-04T09:17:11.4293944Z * [new tag] viable/strict/1762930451 -> viable/strict/1762930451 2025-12-04T09:17:11.4295542Z * [new tag] viable/strict/1762933780 -> viable/strict/1762933780 2025-12-04T09:17:11.4297218Z * [new tag] viable/strict/1762937638 -> viable/strict/1762937638 2025-12-04T09:17:11.4298960Z * [new tag] viable/strict/1762939545 -> viable/strict/1762939545 2025-12-04T09:17:11.4300612Z * [new tag] viable/strict/1762962692 -> viable/strict/1762962692 2025-12-04T09:17:11.4302344Z * [new tag] viable/strict/1762979143 -> viable/strict/1762979143 2025-12-04T09:17:11.4304190Z * [new tag] viable/strict/1762984188 -> viable/strict/1762984188 2025-12-04T09:17:11.4306141Z * [new tag] viable/strict/1762986306 -> viable/strict/1762986306 2025-12-04T09:17:11.4307861Z * [new tag] viable/strict/1762989903 -> viable/strict/1762989903 2025-12-04T09:17:11.4309544Z * [new tag] viable/strict/1762991377 -> viable/strict/1762991377 2025-12-04T09:17:11.4311595Z * [new tag] viable/strict/1762998921 -> viable/strict/1762998921 2025-12-04T09:17:11.4313545Z * [new tag] viable/strict/1763002287 -> viable/strict/1763002287 2025-12-04T09:17:11.4315186Z * [new tag] viable/strict/1763016840 -> viable/strict/1763016840 2025-12-04T09:17:11.4316840Z * [new tag] viable/strict/1763020180 -> viable/strict/1763020180 2025-12-04T09:17:11.4318529Z * [new tag] viable/strict/1763027421 -> viable/strict/1763027421 2025-12-04T09:17:11.4320074Z * [new tag] viable/strict/1763031120 -> viable/strict/1763031120 2025-12-04T09:17:11.4321755Z * [new tag] viable/strict/1763036861 -> viable/strict/1763036861 2025-12-04T09:17:11.4323372Z * [new tag] viable/strict/1763038993 -> viable/strict/1763038993 2025-12-04T09:17:11.4325156Z * [new tag] viable/strict/1763054703 -> viable/strict/1763054703 2025-12-04T09:17:11.4328791Z * [new tag] viable/strict/1763067061 -> viable/strict/1763067061 2025-12-04T09:17:11.4330456Z * [new tag] viable/strict/1763070847 -> viable/strict/1763070847 2025-12-04T09:17:11.4332094Z * [new tag] viable/strict/1763072706 -> viable/strict/1763072706 2025-12-04T09:17:11.4333706Z * [new tag] viable/strict/1763076302 -> viable/strict/1763076302 2025-12-04T09:17:11.4335412Z * [new tag] viable/strict/1763080816 -> viable/strict/1763080816 2025-12-04T09:17:11.4337030Z * [new tag] viable/strict/1763082732 -> viable/strict/1763082732 2025-12-04T09:17:11.4338624Z * [new tag] viable/strict/1763085329 -> viable/strict/1763085329 2025-12-04T09:17:11.4340211Z * [new tag] viable/strict/1763088623 -> viable/strict/1763088623 2025-12-04T09:17:11.4341914Z * [new tag] viable/strict/1763091402 -> viable/strict/1763091402 2025-12-04T09:17:11.4343664Z * [new tag] viable/strict/1763092602 -> viable/strict/1763092602 2025-12-04T09:17:11.4345379Z * [new tag] viable/strict/1763094355 -> viable/strict/1763094355 2025-12-04T09:17:11.4346915Z * [new tag] viable/strict/1763099390 -> viable/strict/1763099390 2025-12-04T09:17:11.4348649Z * [new tag] viable/strict/1763101608 -> viable/strict/1763101608 2025-12-04T09:17:11.4350321Z * [new tag] viable/strict/1763105102 -> viable/strict/1763105102 2025-12-04T09:17:11.4352008Z * [new tag] viable/strict/1763112347 -> viable/strict/1763112347 2025-12-04T09:17:11.4353603Z * [new tag] viable/strict/1763119471 -> viable/strict/1763119471 2025-12-04T09:17:11.4355154Z * [new tag] viable/strict/1763126835 -> viable/strict/1763126835 2025-12-04T09:17:11.4356532Z * [new tag] viable/strict/1763149779 -> viable/strict/1763149779 2025-12-04T09:17:11.4358201Z * [new tag] viable/strict/1763164178 -> viable/strict/1763164178 2025-12-04T09:17:11.4359833Z * [new tag] viable/strict/1763167104 -> viable/strict/1763167104 2025-12-04T09:17:11.4361401Z * [new tag] viable/strict/1763169132 -> viable/strict/1763169132 2025-12-04T09:17:11.4363011Z * [new tag] viable/strict/1763171708 -> viable/strict/1763171708 2025-12-04T09:17:11.4364658Z * [new tag] viable/strict/1763174759 -> viable/strict/1763174759 2025-12-04T09:17:11.4366262Z * [new tag] viable/strict/1763180744 -> viable/strict/1763180744 2025-12-04T09:17:11.4367928Z * [new tag] viable/strict/1763182227 -> viable/strict/1763182227 2025-12-04T09:17:11.4369466Z * [new tag] viable/strict/1763184309 -> viable/strict/1763184309 2025-12-04T09:17:11.4371536Z * [new tag] viable/strict/1763187991 -> viable/strict/1763187991 2025-12-04T09:17:11.4373126Z * [new tag] viable/strict/1763191445 -> viable/strict/1763191445 2025-12-04T09:17:11.4374978Z * [new tag] viable/strict/1763195152 -> viable/strict/1763195152 2025-12-04T09:17:11.4376426Z * [new tag] viable/strict/1763205769 -> viable/strict/1763205769 2025-12-04T09:17:11.4378045Z * [new tag] viable/strict/1763246990 -> viable/strict/1763246990 2025-12-04T09:17:11.4379706Z * [new tag] viable/strict/1763261578 -> viable/strict/1763261578 2025-12-04T09:17:11.4381225Z * [new tag] viable/strict/1763286573 -> viable/strict/1763286573 2025-12-04T09:17:11.4382819Z * [new tag] viable/strict/1763292167 -> viable/strict/1763292167 2025-12-04T09:17:11.4384588Z * [new tag] viable/strict/1763333386 -> viable/strict/1763333386 2025-12-04T09:17:11.4386155Z * [new tag] viable/strict/1763340082 -> viable/strict/1763340082 2025-12-04T09:17:11.4388484Z * [new tag] viable/strict/1763364324 -> viable/strict/1763364324 2025-12-04T09:17:11.4390035Z * [new tag] viable/strict/1763371569 -> viable/strict/1763371569 2025-12-04T09:17:11.4391732Z * [new tag] viable/strict/1763373067 -> viable/strict/1763373067 2025-12-04T09:17:11.4393276Z * [new tag] viable/strict/1763375157 -> viable/strict/1763375157 2025-12-04T09:17:11.4394895Z * [new tag] viable/strict/1763382462 -> viable/strict/1763382462 2025-12-04T09:17:11.4396589Z * [new tag] viable/strict/1763394661 -> viable/strict/1763394661 2025-12-04T09:17:11.4398310Z * [new tag] viable/strict/1763396797 -> viable/strict/1763396797 2025-12-04T09:17:11.4400017Z * [new tag] viable/strict/1763398542 -> viable/strict/1763398542 2025-12-04T09:17:11.4401647Z * [new tag] viable/strict/1763401807 -> viable/strict/1763401807 2025-12-04T09:17:11.4403163Z * [new tag] viable/strict/1763414698 -> viable/strict/1763414698 2025-12-04T09:17:11.4404781Z * [new tag] viable/strict/1763419807 -> viable/strict/1763419807 2025-12-04T09:17:11.4406460Z * [new tag] viable/strict/1763426369 -> viable/strict/1763426369 2025-12-04T09:17:11.4408038Z * [new tag] viable/strict/1763428331 -> viable/strict/1763428331 2025-12-04T09:17:11.4409695Z * [new tag] viable/strict/1763430922 -> viable/strict/1763430922 2025-12-04T09:17:11.4411285Z * [new tag] viable/strict/1763434184 -> viable/strict/1763434184 2025-12-04T09:17:11.4412894Z * [new tag] viable/strict/1763439973 -> viable/strict/1763439973 2025-12-04T09:17:11.4415074Z * [new tag] viable/strict/1763444995 -> viable/strict/1763444995 2025-12-04T09:17:11.4416617Z * [new tag] viable/strict/1763447206 -> viable/strict/1763447206 2025-12-04T09:17:11.4418266Z * [new tag] viable/strict/1763448826 -> viable/strict/1763448826 2025-12-04T09:17:11.4420030Z * [new tag] viable/strict/1763450717 -> viable/strict/1763450717 2025-12-04T09:17:11.4421577Z * [new tag] viable/strict/1763452183 -> viable/strict/1763452183 2025-12-04T09:17:11.4423430Z * [new tag] viable/strict/1763457945 -> viable/strict/1763457945 2025-12-04T09:17:11.4424950Z * [new tag] viable/strict/1763459439 -> viable/strict/1763459439 2025-12-04T09:17:11.4426709Z * [new tag] viable/strict/1763461556 -> viable/strict/1763461556 2025-12-04T09:17:11.4428238Z * [new tag] viable/strict/1763463103 -> viable/strict/1763463103 2025-12-04T09:17:11.4429917Z * [new tag] viable/strict/1763465100 -> viable/strict/1763465100 2025-12-04T09:17:11.4431331Z * [new tag] viable/strict/1763468866 -> viable/strict/1763468866 2025-12-04T09:17:11.4432819Z * [new tag] viable/strict/1763493823 -> viable/strict/1763493823 2025-12-04T09:17:11.4434337Z * [new tag] viable/strict/1763496249 -> viable/strict/1763496249 2025-12-04T09:17:11.4435891Z * [new tag] viable/strict/1763502620 -> viable/strict/1763502620 2025-12-04T09:17:11.4437631Z * [new tag] viable/strict/1763504715 -> viable/strict/1763504715 2025-12-04T09:17:11.4439215Z * [new tag] viable/strict/1763506208 -> viable/strict/1763506208 2025-12-04T09:17:11.4440848Z * [new tag] viable/strict/1763520590 -> viable/strict/1763520590 2025-12-04T09:17:11.4442491Z * [new tag] viable/strict/1763523357 -> viable/strict/1763523357 2025-12-04T09:17:11.4444129Z * [new tag] viable/strict/1763529922 -> viable/strict/1763529922 2025-12-04T09:17:11.4445861Z * [new tag] viable/strict/1763531408 -> viable/strict/1763531408 2025-12-04T09:17:11.4447418Z * [new tag] viable/strict/1763533622 -> viable/strict/1763533622 2025-12-04T09:17:11.4449133Z * [new tag] viable/strict/1763538576 -> viable/strict/1763538576 2025-12-04T09:17:11.4450681Z * [new tag] viable/strict/1763545823 -> viable/strict/1763545823 2025-12-04T09:17:11.4452140Z * [new tag] viable/strict/1763547951 -> viable/strict/1763547951 2025-12-04T09:17:11.4453891Z * [new tag] viable/strict/1763551477 -> viable/strict/1763551477 2025-12-04T09:17:11.4455475Z * [new tag] viable/strict/1763552982 -> viable/strict/1763552982 2025-12-04T09:17:11.4457024Z * [new tag] viable/strict/1763594698 -> viable/strict/1763594698 2025-12-04T09:17:11.4458813Z * [new tag] viable/strict/1763596178 -> viable/strict/1763596178 2025-12-04T09:17:11.4460331Z * [new tag] viable/strict/1763599155 -> viable/strict/1763599155 2025-12-04T09:17:11.4461918Z * [new tag] viable/strict/1763603717 -> viable/strict/1763603717 2025-12-04T09:17:11.4463903Z * [new tag] viable/strict/1763606923 -> viable/strict/1763606923 2025-12-04T09:17:11.4465485Z * [new tag] viable/strict/1763609715 -> viable/strict/1763609715 2025-12-04T09:17:11.4467055Z * [new tag] viable/strict/1763612757 -> viable/strict/1763612757 2025-12-04T09:17:11.4468703Z * [new tag] viable/strict/1763616325 -> viable/strict/1763616325 2025-12-04T09:17:11.4470219Z * [new tag] viable/strict/1763623509 -> viable/strict/1763623509 2025-12-04T09:17:11.4471943Z * [new tag] viable/strict/1763624984 -> viable/strict/1763624984 2025-12-04T09:17:11.4473718Z * [new tag] viable/strict/1763628796 -> viable/strict/1763628796 2025-12-04T09:17:11.4475120Z * [new tag] viable/strict/1763634343 -> viable/strict/1763634343 2025-12-04T09:17:11.4476689Z * [new tag] viable/strict/1763635867 -> viable/strict/1763635867 2025-12-04T09:17:11.4478646Z * [new tag] viable/strict/1763639382 -> viable/strict/1763639382 2025-12-04T09:17:11.4480130Z * [new tag] viable/strict/1763646626 -> viable/strict/1763646626 2025-12-04T09:17:11.4481913Z * [new tag] viable/strict/1763655997 -> viable/strict/1763655997 2025-12-04T09:17:11.4483504Z * [new tag] viable/strict/1763659444 -> viable/strict/1763659444 2025-12-04T09:17:11.4485088Z * [new tag] viable/strict/1763660992 -> viable/strict/1763660992 2025-12-04T09:17:11.4486674Z * [new tag] viable/strict/1763663201 -> viable/strict/1763663201 2025-12-04T09:17:11.4488340Z * [new tag] viable/strict/1763670362 -> viable/strict/1763670362 2025-12-04T09:17:11.4489858Z * [new tag] viable/strict/1763675378 -> viable/strict/1763675378 2025-12-04T09:17:11.4491425Z * [new tag] viable/strict/1763693343 -> viable/strict/1763693343 2025-12-04T09:17:11.4492966Z * [new tag] viable/strict/1763696088 -> viable/strict/1763696088 2025-12-04T09:17:11.4494708Z * [new tag] viable/strict/1763697343 -> viable/strict/1763697343 2025-12-04T09:17:11.4496297Z * [new tag] viable/strict/1763699165 -> viable/strict/1763699165 2025-12-04T09:17:11.4497929Z * [new tag] viable/strict/1763700660 -> viable/strict/1763700660 2025-12-04T09:17:11.4499513Z * [new tag] viable/strict/1763704209 -> viable/strict/1763704209 2025-12-04T09:17:11.4501142Z * [new tag] viable/strict/1763706411 -> viable/strict/1763706411 2025-12-04T09:17:11.4502830Z * [new tag] viable/strict/1763708082 -> viable/strict/1763708082 2025-12-04T09:17:11.4504305Z * [new tag] viable/strict/1763711381 -> viable/strict/1763711381 2025-12-04T09:17:11.4505811Z * [new tag] viable/strict/1763713593 -> viable/strict/1763713593 2025-12-04T09:17:11.4507504Z * [new tag] viable/strict/1763715201 -> viable/strict/1763715201 2025-12-04T09:17:11.4509073Z * [new tag] viable/strict/1763733017 -> viable/strict/1763733017 2025-12-04T09:17:11.4510737Z * [new tag] viable/strict/1763735108 -> viable/strict/1763735108 2025-12-04T09:17:11.4512271Z * [new tag] viable/strict/1763749579 -> viable/strict/1763749579 2025-12-04T09:17:11.4513998Z * [new tag] viable/strict/1763751113 -> viable/strict/1763751113 2025-12-04T09:17:11.4515650Z * [new tag] viable/strict/1763753035 -> viable/strict/1763753035 2025-12-04T09:17:11.4517396Z * [new tag] viable/strict/1763754578 -> viable/strict/1763754578 2025-12-04T09:17:11.4519377Z * [new tag] viable/strict/1763756748 -> viable/strict/1763756748 2025-12-04T09:17:11.4521004Z * [new tag] viable/strict/1763758205 -> viable/strict/1763758205 2025-12-04T09:17:11.4522435Z * [new tag] viable/strict/1763764050 -> viable/strict/1763764050 2025-12-04T09:17:11.4524056Z * [new tag] viable/strict/1763771887 -> viable/strict/1763771887 2025-12-04T09:17:11.4526008Z * [new tag] viable/strict/1763773920 -> viable/strict/1763773920 2025-12-04T09:17:11.4527573Z * [new tag] viable/strict/1763776501 -> viable/strict/1763776501 2025-12-04T09:17:11.4529157Z * [new tag] viable/strict/1763779437 -> viable/strict/1763779437 2025-12-04T09:17:11.4530967Z * [new tag] viable/strict/1763781038 -> viable/strict/1763781038 2025-12-04T09:17:11.4532576Z * [new tag] viable/strict/1763782245 -> viable/strict/1763782245 2025-12-04T09:17:11.4534104Z * [new tag] viable/strict/1763785568 -> viable/strict/1763785568 2025-12-04T09:17:11.4535665Z * [new tag] viable/strict/1763787006 -> viable/strict/1763787006 2025-12-04T09:17:11.4537374Z * [new tag] viable/strict/1763789103 -> viable/strict/1763789103 2025-12-04T09:17:11.4538968Z * [new tag] viable/strict/1763790578 -> viable/strict/1763790578 2025-12-04T09:17:11.4540523Z * [new tag] viable/strict/1763796275 -> viable/strict/1763796275 2025-12-04T09:17:11.4542445Z * [new tag] viable/strict/1763801465 -> viable/strict/1763801465 2025-12-04T09:17:11.4544067Z * [new tag] viable/strict/1763803522 -> viable/strict/1763803522 2025-12-04T09:17:11.4545726Z * [new tag] viable/strict/1763808581 -> viable/strict/1763808581 2025-12-04T09:17:11.4547278Z * [new tag] viable/strict/1763840977 -> viable/strict/1763840977 2025-12-04T09:17:11.4548914Z * [new tag] viable/strict/1763846659 -> viable/strict/1763846659 2025-12-04T09:17:11.4550500Z * [new tag] viable/strict/1763872065 -> viable/strict/1763872065 2025-12-04T09:17:11.4552181Z * [new tag] viable/strict/1763873648 -> viable/strict/1763873648 2025-12-04T09:17:11.4553754Z * [new tag] viable/strict/1763875506 -> viable/strict/1763875506 2025-12-04T09:17:11.4555227Z * [new tag] viable/strict/1763889904 -> viable/strict/1763889904 2025-12-04T09:17:11.4556854Z * [new tag] viable/strict/1763930999 -> viable/strict/1763930999 2025-12-04T09:17:11.4558423Z * [new tag] viable/strict/1763944964 -> viable/strict/1763944964 2025-12-04T09:17:11.4559852Z * [new tag] viable/strict/1763958474 -> viable/strict/1763958474 2025-12-04T09:17:11.4561507Z * [new tag] viable/strict/1763967263 -> viable/strict/1763967263 2025-12-04T09:17:11.4563071Z * [new tag] viable/strict/1763972803 -> viable/strict/1763972803 2025-12-04T09:17:11.4564576Z * [new tag] viable/strict/1763976376 -> viable/strict/1763976376 2025-12-04T09:17:11.4566232Z * [new tag] viable/strict/1763989404 -> viable/strict/1763989404 2025-12-04T09:17:11.4567967Z * [new tag] viable/strict/1763990887 -> viable/strict/1763990887 2025-12-04T09:17:11.4569546Z * [new tag] viable/strict/1764019919 -> viable/strict/1764019919 2025-12-04T09:17:11.4571188Z * [new tag] viable/strict/1764023134 -> viable/strict/1764023134 2025-12-04T09:17:11.4572648Z * [new tag] viable/strict/1764024593 -> viable/strict/1764024593 2025-12-04T09:17:11.4574204Z * [new tag] viable/strict/1764026706 -> viable/strict/1764026706 2025-12-04T09:17:11.4576065Z * [new tag] viable/strict/1764031139 -> viable/strict/1764031139 2025-12-04T09:17:11.4577706Z * [new tag] viable/strict/1764033131 -> viable/strict/1764033131 2025-12-04T09:17:11.4579115Z * [new tag] viable/strict/1764035725 -> viable/strict/1764035725 2025-12-04T09:17:11.4580578Z * [new tag] viable/strict/1764624265 -> viable/strict/1764624265 2025-12-04T09:17:11.4582013Z * [new tag] viable/strict/1764631514 -> viable/strict/1764631514 2025-12-04T09:17:11.4583617Z * [new tag] viable/strict/1764632987 -> viable/strict/1764632987 2025-12-04T09:17:11.4584975Z * [new tag] viable/strict/1764636063 -> viable/strict/1764636063 2025-12-04T09:17:11.4586414Z * [new tag] viable/strict/1764643975 -> viable/strict/1764643975 2025-12-04T09:17:11.4587819Z * [new tag] viable/strict/1764646859 -> viable/strict/1764646859 2025-12-04T09:17:11.4589358Z * [new tag] viable/strict/1764653120 -> viable/strict/1764653120 2025-12-04T09:17:11.4590728Z * [new tag] viable/strict/1764654632 -> viable/strict/1764654632 2025-12-04T09:17:11.4592080Z * [new tag] viable/strict/1764656821 -> viable/strict/1764656821 2025-12-04T09:17:11.4593571Z * [new tag] viable/strict/1764658557 -> viable/strict/1764658557 2025-12-04T09:17:11.4595042Z * [new tag] viable/strict/1764660333 -> viable/strict/1764660333 2025-12-04T09:17:11.4596426Z * [new tag] viable/strict/1764661812 -> viable/strict/1764661812 2025-12-04T09:17:11.4597896Z * [new tag] viable/strict/1764664023 -> viable/strict/1764664023 2025-12-04T09:17:11.4599278Z * [new tag] viable/strict/1764669150 -> viable/strict/1764669150 2025-12-04T09:17:11.4600728Z * [new tag] viable/strict/1764680709 -> viable/strict/1764680709 2025-12-04T09:17:11.4602107Z * [new tag] viable/strict/1764687619 -> viable/strict/1764687619 2025-12-04T09:17:11.4603533Z * [new tag] viable/strict/1764696355 -> viable/strict/1764696355 2025-12-04T09:17:11.4604951Z * [new tag] viable/strict/1764701767 -> viable/strict/1764701767 2025-12-04T09:17:11.4606364Z * [new tag] viable/strict/1764710768 -> viable/strict/1764710768 2025-12-04T09:17:11.4607847Z * [new tag] viable/strict/1764716202 -> viable/strict/1764716202 2025-12-04T09:17:11.4609301Z * [new tag] viable/strict/1764793566 -> viable/strict/1764793566 2025-12-04T09:17:11.4610702Z * [new tag] viable/strict/1764797093 -> viable/strict/1764797093 2025-12-04T09:17:11.4612130Z * [new tag] viable/strict/1764800729 -> viable/strict/1764800729 2025-12-04T09:17:11.4613775Z * [new tag] whc_flight_1 -> whc_flight_1 2025-12-04T09:17:11.4615421Z * [new tag] whc_flight_2 -> whc_flight_2 2025-12-04T09:17:11.4617110Z * [new tag] whc_flight_4 -> whc_flight_4 2025-12-04T09:17:11.5798809Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T09:17:11.5831625Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:17:11.5836533Z ##[endgroup] 2025-12-04T09:17:11.5836827Z ##[group]Determining the checkout info 2025-12-04T09:17:11.5838361Z ##[endgroup] 2025-12-04T09:17:11.5842696Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T09:17:11.5886228Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T09:17:11.5921592Z ##[group]Checking out the ref 2025-12-04T09:17:11.5924651Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:17:12.6211389Z Updating files: 65% (13206/20121) 2025-12-04T09:17:12.6292692Z Updating files: 66% (13280/20121) 2025-12-04T09:17:12.6376272Z Updating files: 67% (13482/20121) 2025-12-04T09:17:12.6458273Z Updating files: 68% (13683/20121) 2025-12-04T09:17:12.6670144Z Updating files: 69% (13884/20121) 2025-12-04T09:17:12.7001251Z Updating files: 70% (14085/20121) 2025-12-04T09:17:12.7070272Z Updating files: 71% (14286/20121) 2025-12-04T09:17:12.7165455Z Updating files: 72% (14488/20121) 2025-12-04T09:17:12.7384569Z Updating files: 73% (14689/20121) 2025-12-04T09:17:12.7659310Z Updating files: 74% (14890/20121) 2025-12-04T09:17:12.8225930Z Updating files: 75% (15091/20121) 2025-12-04T09:17:12.8403239Z Updating files: 76% (15292/20121) 2025-12-04T09:17:12.8573259Z Updating files: 77% (15494/20121) 2025-12-04T09:17:12.8815797Z Updating files: 78% (15695/20121) 2025-12-04T09:17:12.9115331Z Updating files: 79% (15896/20121) 2025-12-04T09:17:12.9476866Z Updating files: 80% (16097/20121) 2025-12-04T09:17:12.9803874Z Updating files: 81% (16299/20121) 2025-12-04T09:17:13.0063004Z Updating files: 82% (16500/20121) 2025-12-04T09:17:13.0254387Z Updating files: 83% (16701/20121) 2025-12-04T09:17:13.0429399Z Updating files: 84% (16902/20121) 2025-12-04T09:17:13.0624998Z Updating files: 85% (17103/20121) 2025-12-04T09:17:13.0818100Z Updating files: 86% (17305/20121) 2025-12-04T09:17:13.0995752Z Updating files: 87% (17506/20121) 2025-12-04T09:17:13.1145441Z Updating files: 88% (17707/20121) 2025-12-04T09:17:13.1320814Z Updating files: 89% (17908/20121) 2025-12-04T09:17:13.1532011Z Updating files: 90% (18109/20121) 2025-12-04T09:17:13.1684569Z Updating files: 91% (18311/20121) 2025-12-04T09:17:13.1877434Z Updating files: 92% (18512/20121) 2025-12-04T09:17:13.2099941Z Updating files: 93% (18713/20121) 2025-12-04T09:17:13.2343052Z Updating files: 94% (18914/20121) 2025-12-04T09:17:13.2555790Z Updating files: 95% (19115/20121) 2025-12-04T09:17:13.2750830Z Updating files: 96% (19317/20121) 2025-12-04T09:17:13.2953852Z Updating files: 97% (19518/20121) 2025-12-04T09:17:13.3282301Z Updating files: 98% (19719/20121) 2025-12-04T09:17:13.3496322Z Updating files: 99% (19920/20121) 2025-12-04T09:17:13.3497134Z Updating files: 100% (20121/20121) 2025-12-04T09:17:13.3497925Z Updating files: 100% (20121/20121), done. 2025-12-04T09:17:13.3785101Z Note: switching to 'ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32'. 2025-12-04T09:17:13.3785509Z 2025-12-04T09:17:13.3785737Z You are in 'detached HEAD' state. You can look around, make experimental 2025-12-04T09:17:13.3786271Z changes and commit them, and you can discard any commits you make in this 2025-12-04T09:17:13.3786813Z state without impacting any branches by switching back to a branch. 2025-12-04T09:17:13.3787130Z 2025-12-04T09:17:13.3787346Z If you want to create a new branch to retain commits you create, you may 2025-12-04T09:17:13.3787837Z do so (now or later) by using -c with the switch command. Example: 2025-12-04T09:17:13.3788132Z 2025-12-04T09:17:13.3788252Z git switch -c 2025-12-04T09:17:13.3788454Z 2025-12-04T09:17:13.3788576Z Or undo this operation with: 2025-12-04T09:17:13.3788757Z 2025-12-04T09:17:13.3788853Z git switch - 2025-12-04T09:17:13.3788991Z 2025-12-04T09:17:13.3789236Z Turn off this advice by setting config variable advice.detachedHead to false 2025-12-04T09:17:13.3789585Z 2025-12-04T09:17:13.3791839Z HEAD is now at ffd9b0fb435 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T09:17:13.3974110Z ##[endgroup] 2025-12-04T09:17:13.3974559Z ##[group]Setting up auth for fetching submodules 2025-12-04T09:17:13.3980414Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T09:17:13.4038068Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T09:17:13.4072931Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T09:17:13.4108865Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T09:17:13.4141322Z ##[endgroup] 2025-12-04T09:17:13.4141745Z ##[group]Fetching submodules 2025-12-04T09:17:13.4144206Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T09:17:13.4555280Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T09:17:13.4955094Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2025-12-04T09:17:13.4956489Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2025-12-04T09:17:13.4960420Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2025-12-04T09:17:13.4964144Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2025-12-04T09:17:13.4967930Z Submodule 'third_party/NVTX' (https://github.com/NVIDIA/NVTX.git) registered for path 'third_party/NVTX' 2025-12-04T09:17:13.4972517Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2025-12-04T09:17:13.4976060Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2025-12-04T09:17:13.4980288Z Submodule 'third_party/aiter' (https://github.com/ROCm/aiter.git) registered for path 'third_party/aiter' 2025-12-04T09:17:13.4986216Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2025-12-04T09:17:13.4991665Z Submodule 'third_party/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/composable_kernel' 2025-12-04T09:17:13.4996391Z Submodule 'third_party/cpp-httplib' (https://github.com/yhirose/cpp-httplib.git) registered for path 'third_party/cpp-httplib' 2025-12-04T09:17:13.5000734Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2025-12-04T09:17:13.5005286Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2025-12-04T09:17:13.5009860Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2025-12-04T09:17:13.5015595Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2025-12-04T09:17:13.5020468Z Submodule 'third_party/flash-attention' (https://github.com/Dao-AILab/flash-attention.git) registered for path 'third_party/flash-attention' 2025-12-04T09:17:13.5026372Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2025-12-04T09:17:13.5032393Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2025-12-04T09:17:13.5037445Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2025-12-04T09:17:13.5042324Z Submodule 'third_party/gloo' (https://github.com/pytorch/gloo) registered for path 'third_party/gloo' 2025-12-04T09:17:13.5047671Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2025-12-04T09:17:13.5054361Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2025-12-04T09:17:13.5058774Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2025-12-04T09:17:13.5064402Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2025-12-04T09:17:13.5070227Z Submodule 'third_party/kleidiai' (https://github.com/ARM-software/kleidiai.git) registered for path 'third_party/kleidiai' 2025-12-04T09:17:13.5076058Z Submodule 'third_party/mimalloc' (https://github.com/microsoft/mimalloc.git) registered for path 'third_party/mimalloc' 2025-12-04T09:17:13.5082026Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2025-12-04T09:17:13.5088010Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2025-12-04T09:17:13.5094206Z Submodule 'third_party/opentelemetry-cpp' (https://github.com/open-telemetry/opentelemetry-cpp.git) registered for path 'third_party/opentelemetry-cpp' 2025-12-04T09:17:13.5100220Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2025-12-04T09:17:13.5106634Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2025-12-04T09:17:13.5112871Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2025-12-04T09:17:13.5119639Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2025-12-04T09:17:13.5129702Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2025-12-04T09:17:13.5136158Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2025-12-04T09:17:13.5142687Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2025-12-04T09:17:13.5149683Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2025-12-04T09:17:13.5190613Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2025-12-04T09:17:13.7659847Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2025-12-04T09:17:13.7660638Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 2025-12-04T09:17:13.7687157Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2025-12-04T09:17:16.8281418Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2025-12-04T09:17:16.8282652Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 2025-12-04T09:17:16.8283769Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NVTX'... 2025-12-04T09:17:16.8284818Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2025-12-04T09:17:16.8285981Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2025-12-04T09:17:16.8287134Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 2025-12-04T09:17:16.8288314Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention'... 2025-12-04T09:17:16.8289558Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpp-httplib'... 2025-12-04T09:17:16.8290724Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 2025-12-04T09:17:16.8291822Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2025-12-04T09:17:16.8292938Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kleidiai'... 2025-12-04T09:17:16.8294085Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2025-12-04T09:17:16.8295240Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2025-12-04T09:17:16.8296357Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2025-12-04T09:17:16.8297549Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2025-12-04T09:17:16.8298666Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/mimalloc'... 2025-12-04T09:17:16.8299781Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 2025-12-04T09:17:16.8300960Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 2025-12-04T09:17:16.8353443Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2025-12-04T09:17:17.4240080Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2025-12-04T09:17:17.4241252Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2025-12-04T09:17:17.4242352Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2025-12-04T09:17:17.5242656Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp'... 2025-12-04T09:17:31.2556093Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2025-12-04T09:17:31.2557556Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 2025-12-04T09:17:31.2558661Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'... 2025-12-04T09:17:31.2559432Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2025-12-04T09:17:31.2560830Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2025-12-04T09:17:31.2561997Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/composable_kernel'... 2025-12-04T09:17:31.2563196Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2025-12-04T09:17:31.3558017Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2025-12-04T09:17:38.3150371Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/aiter'... 2025-12-04T09:17:38.3151278Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 2025-12-04T09:17:38.3367605Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T09:17:38.3542482Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T09:17:38.3683782Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T09:17:38.4034209Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T09:17:38.5097871Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T09:17:38.5700364Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T09:17:39.5591329Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T09:17:39.7863110Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T09:17:39.7890139Z Submodule '3rdparty/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T09:17:39.7924725Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/aiter/3rdparty/composable_kernel'... 2025-12-04T09:17:45.2872836Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T09:17:45.3209320Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T09:17:45.7965357Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T09:17:45.8583288Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T09:17:45.9784999Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T09:17:46.0393689Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T09:17:46.8857393Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T09:17:47.0897031Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T09:17:47.0926831Z Submodule 'external/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/external/asmjit' 2025-12-04T09:17:47.0932360Z Submodule 'external/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/fbgemm/external/composable_kernel' 2025-12-04T09:17:47.0936248Z Submodule 'external/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/external/cpuinfo' 2025-12-04T09:17:47.0940150Z Submodule 'external/cutlass' (https://github.com/jwfromm/cutlass) registered for path 'third_party/fbgemm/external/cutlass' 2025-12-04T09:17:47.0945090Z Submodule 'external/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/external/googletest' 2025-12-04T09:17:47.0949206Z Submodule 'external/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/external/hipify_torch' 2025-12-04T09:17:47.0953306Z Submodule 'external/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/fbgemm/external/json' 2025-12-04T09:17:47.0988010Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/asmjit'... 2025-12-04T09:17:48.3642131Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/hipify_torch'... 2025-12-04T09:17:48.3643022Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/cpuinfo'... 2025-12-04T09:17:48.3643852Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/googletest'... 2025-12-04T09:17:48.4643348Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/composable_kernel'... 2025-12-04T09:17:52.1362023Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/cutlass'... 2025-12-04T09:17:52.2363113Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/json'... 2025-12-04T09:17:54.8374133Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T09:17:55.3100411Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T09:17:55.4312732Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T09:17:56.2474619Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T09:17:56.3049810Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T09:17:56.3209333Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T09:17:56.5002702Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T09:17:56.5956030Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T09:17:56.5984219Z Submodule 'csrc/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T09:17:56.5986766Z Submodule 'csrc/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/flash-attention/csrc/cutlass' 2025-12-04T09:17:56.6022083Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention/csrc/composable_kernel'... 2025-12-04T09:18:01.4137816Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention/csrc/cutlass'... 2025-12-04T09:18:01.7478959Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T09:18:02.4713347Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T09:18:02.6589133Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T09:18:02.6966906Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T09:18:02.7452701Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T09:18:02.7813108Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T09:18:02.8372057Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T09:18:02.8557734Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T09:18:02.8581859Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2025-12-04T09:18:02.8615811Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 2025-12-04T09:18:18.4620424Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T09:18:18.4925105Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T09:18:18.5953833Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T09:18:18.5981915Z Submodule 'libkineto/third_party/dynolog' (https://github.com/facebookincubator/dynolog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T09:18:18.5984874Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T09:18:18.5989027Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T09:18:18.6024255Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog'... 2025-12-04T09:18:19.6147662Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2025-12-04T09:18:20.1859187Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 2025-12-04T09:18:20.2994146Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T09:18:20.3017126Z Submodule 'third_party/DCGM' (https://github.com/NVIDIA/DCGM.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T09:18:20.3020909Z Submodule 'third_party/cpr' (https://github.com/libcpr/cpr.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T09:18:20.3025281Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T09:18:20.3029668Z Submodule 'third_party/gflags' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T09:18:20.3033933Z Submodule 'third_party/glog' (https://github.com/google/glog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T09:18:20.3038834Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T09:18:20.3043238Z Submodule 'third_party/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T09:18:20.3047813Z Submodule 'third_party/pfs' (https://github.com/dtrugman/pfs.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T09:18:20.3052393Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T09:18:20.3088993Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'... 2025-12-04T09:18:22.1982238Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'... 2025-12-04T09:18:22.1983428Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'... 2025-12-04T09:18:22.1984823Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp'... 2025-12-04T09:18:22.1985952Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'... 2025-12-04T09:18:22.1987173Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/glog'... 2025-12-04T09:18:22.1988274Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'... 2025-12-04T09:18:22.1989372Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'... 2025-12-04T09:18:22.2982612Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json'... 2025-12-04T09:18:28.9002846Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T09:18:28.9253593Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T09:18:28.9718765Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T09:18:28.9913867Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T09:18:28.9937477Z Submodule 'doc' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T09:18:28.9971836Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'... 2025-12-04T09:18:29.3095039Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T09:18:29.3348788Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T09:18:29.3914412Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T09:18:29.5212759Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T09:18:29.5449804Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T09:18:29.5763295Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T09:18:29.5786934Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:29.5790282Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:29.5829671Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb'... 2025-12-04T09:18:32.0698986Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest'... 2025-12-04T09:18:32.3636354Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T09:18:32.4215679Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T09:18:32.4623460Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T09:18:32.5183194Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T09:18:32.5905844Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T09:18:32.6418883Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T09:18:32.7781399Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-12-04T09:18:33.4091199Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T09:18:33.4136707Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2025-12-04T09:18:33.4168831Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 2025-12-04T09:18:34.3848799Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T09:18:34.4841833Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T09:18:34.4870351Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark) registered for path 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T09:18:34.4873702Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T09:18:34.4877941Z Submodule 'third_party/ms-gsl' (https://github.com/microsoft/GSL) registered for path 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T09:18:34.4882397Z Submodule 'third_party/nlohmann-json' (https://github.com/nlohmann/json) registered for path 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T09:18:34.4887021Z Submodule 'third_party/opentelemetry-proto' (https://github.com/open-telemetry/opentelemetry-proto) registered for path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T09:18:34.4891802Z Submodule 'third_party/opentracing-cpp' (https://github.com/opentracing/opentracing-cpp.git) registered for path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T09:18:34.4896324Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T09:18:34.4900714Z Submodule 'tools/vcpkg' (https://github.com/Microsoft/vcpkg) registered for path 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T09:18:34.4936028Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/benchmark'... 2025-12-04T09:18:34.9497634Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp'... 2025-12-04T09:18:34.9498718Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto'... 2025-12-04T09:18:34.9500055Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl'... 2025-12-04T09:18:34.9502731Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp'... 2025-12-04T09:18:35.0498081Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/googletest'... 2025-12-04T09:18:35.7481321Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/nlohmann-json'... 2025-12-04T09:18:42.6807297Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/tools/vcpkg'... 2025-12-04T09:18:43.5336597Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T09:18:43.5850328Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T09:18:43.6078676Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T09:18:43.7435201Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T09:18:43.7624015Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T09:18:43.7836776Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T09:18:43.8066459Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T09:18:43.8091591Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:43.8094477Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:43.8129021Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'... 2025-12-04T09:18:46.0720664Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'... 2025-12-04T09:18:46.3643370Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T09:18:46.4219504Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T09:18:47.1690975Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T09:18:47.1857077Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T09:18:47.5243481Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T09:18:47.5274803Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2025-12-04T09:18:47.5277676Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2025-12-04T09:18:47.5312547Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2025-12-04T09:18:48.1012189Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 2025-12-04T09:18:48.5488460Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T09:18:48.6352576Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T09:18:48.6493869Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T09:18:48.6666656Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T09:18:48.7225050Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T09:18:48.7595309Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T09:18:48.8140229Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T09:18:48.8526271Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T09:18:48.8551935Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2025-12-04T09:18:48.8554880Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2025-12-04T09:18:48.8558762Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2025-12-04T09:18:48.8563363Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T09:18:48.8597612Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2025-12-04T09:18:50.0178372Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2025-12-04T09:18:50.0179392Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2025-12-04T09:18:50.0986306Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2025-12-04T09:18:50.1676162Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T09:18:50.1896826Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T09:18:50.2806324Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T09:18:50.3188524Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T09:18:50.3212378Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T09:18:50.3247905Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 2025-12-04T09:18:50.5223267Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T09:18:50.5276667Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T09:18:50.5675548Z Entering 'android/libs/fbjni' 2025-12-04T09:18:50.5735986Z Entering 'third_party/FP16' 2025-12-04T09:18:50.5795143Z Entering 'third_party/FXdiv' 2025-12-04T09:18:50.5853841Z Entering 'third_party/NNPACK' 2025-12-04T09:18:50.5913111Z Entering 'third_party/NVTX' 2025-12-04T09:18:50.5973330Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T09:18:50.6035813Z Entering 'third_party/XNNPACK' 2025-12-04T09:18:50.6109307Z Entering 'third_party/aiter' 2025-12-04T09:18:50.6169412Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T09:18:50.6238298Z Entering 'third_party/benchmark' 2025-12-04T09:18:50.6298042Z Entering 'third_party/composable_kernel' 2025-12-04T09:18:50.6371824Z Entering 'third_party/cpp-httplib' 2025-12-04T09:18:50.6433751Z Entering 'third_party/cpuinfo' 2025-12-04T09:18:50.6492705Z Entering 'third_party/cudnn_frontend' 2025-12-04T09:18:50.6553302Z Entering 'third_party/cutlass' 2025-12-04T09:18:50.6621606Z Entering 'third_party/fbgemm' 2025-12-04T09:18:50.6681968Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T09:18:50.6738863Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T09:18:50.6805351Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T09:18:50.6863619Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T09:18:50.6930135Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T09:18:50.6987510Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T09:18:50.7042703Z Entering 'third_party/fbgemm/external/json' 2025-12-04T09:18:50.7104326Z Entering 'third_party/flash-attention' 2025-12-04T09:18:50.7163368Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T09:18:50.7227305Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T09:18:50.7294970Z Entering 'third_party/flatbuffers' 2025-12-04T09:18:50.7357248Z Entering 'third_party/fmt' 2025-12-04T09:18:50.7415292Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T09:18:50.7474726Z Entering 'third_party/gloo' 2025-12-04T09:18:50.7533865Z Entering 'third_party/googletest' 2025-12-04T09:18:50.7592070Z Entering 'third_party/ideep' 2025-12-04T09:18:50.7650908Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T09:18:50.7718048Z Entering 'third_party/ittapi' 2025-12-04T09:18:50.7778210Z Entering 'third_party/kineto' 2025-12-04T09:18:50.7839386Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T09:18:50.7895239Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T09:18:50.7960970Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T09:18:50.8022301Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T09:18:50.8079552Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T09:18:50.8135613Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T09:18:50.8198525Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T09:18:50.8258934Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T09:18:50.8315650Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T09:18:50.8374923Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T09:18:50.8432734Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T09:18:50.8489286Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:50.8552134Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:50.8621206Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T09:18:50.8678566Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T09:18:50.8741682Z Entering 'third_party/kleidiai' 2025-12-04T09:18:50.8801023Z Entering 'third_party/mimalloc' 2025-12-04T09:18:50.8859390Z Entering 'third_party/nlohmann' 2025-12-04T09:18:50.8919066Z Entering 'third_party/onnx' 2025-12-04T09:18:50.8996261Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T09:18:50.9060300Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T09:18:50.9120909Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T09:18:50.9182162Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T09:18:50.9244838Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T09:18:50.9300834Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T09:18:50.9359966Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T09:18:50.9420619Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T09:18:50.9481214Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T09:18:50.9536726Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:50.9596585Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:50.9657873Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T09:18:50.9747679Z Entering 'third_party/pocketfft' 2025-12-04T09:18:50.9807268Z Entering 'third_party/protobuf' 2025-12-04T09:18:50.9870398Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T09:18:50.9927780Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T09:18:50.9988983Z Entering 'third_party/psimd' 2025-12-04T09:18:51.0049439Z Entering 'third_party/pthreadpool' 2025-12-04T09:18:51.0107690Z Entering 'third_party/pybind11' 2025-12-04T09:18:51.0167136Z Entering 'third_party/python-peachpy' 2025-12-04T09:18:51.0226266Z Entering 'third_party/sleef' 2025-12-04T09:18:51.0285610Z Entering 'third_party/tensorpipe' 2025-12-04T09:18:51.0342741Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T09:18:51.0399799Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T09:18:51.0457021Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T09:18:51.0516222Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T09:18:51.0570665Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T09:18:51.0653067Z ##[endgroup] 2025-12-04T09:18:51.0661409Z ##[group]Persisting credentials for submodules 2025-12-04T09:18:51.0662442Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T09:18:51.1056775Z Entering 'android/libs/fbjni' 2025-12-04T09:18:51.1135557Z Entering 'third_party/FP16' 2025-12-04T09:18:51.1216129Z Entering 'third_party/FXdiv' 2025-12-04T09:18:51.1297692Z Entering 'third_party/NNPACK' 2025-12-04T09:18:51.1378094Z Entering 'third_party/NVTX' 2025-12-04T09:18:51.1458571Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T09:18:51.1539572Z Entering 'third_party/XNNPACK' 2025-12-04T09:18:51.1635411Z Entering 'third_party/aiter' 2025-12-04T09:18:51.1714639Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T09:18:51.1801472Z Entering 'third_party/benchmark' 2025-12-04T09:18:51.1881829Z Entering 'third_party/composable_kernel' 2025-12-04T09:18:51.1969497Z Entering 'third_party/cpp-httplib' 2025-12-04T09:18:51.2050669Z Entering 'third_party/cpuinfo' 2025-12-04T09:18:51.2131455Z Entering 'third_party/cudnn_frontend' 2025-12-04T09:18:51.2210368Z Entering 'third_party/cutlass' 2025-12-04T09:18:51.2304166Z Entering 'third_party/fbgemm' 2025-12-04T09:18:51.2384439Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T09:18:51.2461986Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T09:18:51.2547951Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T09:18:51.2625437Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T09:18:51.2711344Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T09:18:51.2787744Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T09:18:51.2863767Z Entering 'third_party/fbgemm/external/json' 2025-12-04T09:18:51.2944955Z Entering 'third_party/flash-attention' 2025-12-04T09:18:51.3025380Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T09:18:51.3112110Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T09:18:51.3202088Z Entering 'third_party/flatbuffers' 2025-12-04T09:18:51.3286400Z Entering 'third_party/fmt' 2025-12-04T09:18:51.3367034Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T09:18:51.3447204Z Entering 'third_party/gloo' 2025-12-04T09:18:51.3531891Z Entering 'third_party/googletest' 2025-12-04T09:18:51.3615554Z Entering 'third_party/ideep' 2025-12-04T09:18:51.3693525Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T09:18:51.3784042Z Entering 'third_party/ittapi' 2025-12-04T09:18:51.3863900Z Entering 'third_party/kineto' 2025-12-04T09:18:51.3941969Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T09:18:51.4023374Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T09:18:51.4101599Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T09:18:51.4182115Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T09:18:51.4260450Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T09:18:51.4339201Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T09:18:51.4419859Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T09:18:51.4502701Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T09:18:51.4580764Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T09:18:51.4659739Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T09:18:51.4736329Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T09:18:51.4815010Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:51.4901918Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:51.4991415Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T09:18:51.5070582Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T09:18:51.5150707Z Entering 'third_party/kleidiai' 2025-12-04T09:18:51.5232107Z Entering 'third_party/mimalloc' 2025-12-04T09:18:51.5311007Z Entering 'third_party/nlohmann' 2025-12-04T09:18:51.5392337Z Entering 'third_party/onnx' 2025-12-04T09:18:51.5486842Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T09:18:51.5570613Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T09:18:51.5651642Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T09:18:51.5728707Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T09:18:51.5807843Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T09:18:51.5886854Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T09:18:51.5963140Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T09:18:51.6039508Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T09:18:51.6115168Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T09:18:51.6189453Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:51.6269243Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:51.6354373Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T09:18:51.6453302Z Entering 'third_party/pocketfft' 2025-12-04T09:18:51.6534203Z Entering 'third_party/protobuf' 2025-12-04T09:18:51.6614274Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T09:18:51.6691086Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T09:18:51.6773094Z Entering 'third_party/psimd' 2025-12-04T09:18:51.6853131Z Entering 'third_party/pthreadpool' 2025-12-04T09:18:51.6932606Z Entering 'third_party/pybind11' 2025-12-04T09:18:51.7017829Z Entering 'third_party/python-peachpy' 2025-12-04T09:18:51.7097164Z Entering 'third_party/sleef' 2025-12-04T09:18:51.7176468Z Entering 'third_party/tensorpipe' 2025-12-04T09:18:51.7255539Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T09:18:51.7335524Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T09:18:51.7412079Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T09:18:51.7487986Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T09:18:51.7561350Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T09:18:51.7665806Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T09:18:51.8059751Z Entering 'android/libs/fbjni' 2025-12-04T09:18:51.8140037Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T09:18:51.8166243Z Entering 'third_party/FP16' 2025-12-04T09:18:51.8239555Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T09:18:51.8263202Z Entering 'third_party/FXdiv' 2025-12-04T09:18:51.8334240Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T09:18:51.8359046Z Entering 'third_party/NNPACK' 2025-12-04T09:18:51.8432318Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T09:18:51.8457307Z Entering 'third_party/NVTX' 2025-12-04T09:18:51.8529083Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T09:18:51.8554655Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T09:18:51.8625863Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T09:18:51.8651474Z Entering 'third_party/XNNPACK' 2025-12-04T09:18:51.8723197Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T09:18:51.8763236Z Entering 'third_party/aiter' 2025-12-04T09:18:51.8834859Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T09:18:51.8859210Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T09:18:51.8935705Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T09:18:51.8971331Z Entering 'third_party/benchmark' 2025-12-04T09:18:51.9046665Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T09:18:51.9071230Z Entering 'third_party/composable_kernel' 2025-12-04T09:18:51.9141638Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T09:18:51.9175623Z Entering 'third_party/cpp-httplib' 2025-12-04T09:18:51.9246895Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T09:18:51.9272340Z Entering 'third_party/cpuinfo' 2025-12-04T09:18:51.9344747Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T09:18:51.9370956Z Entering 'third_party/cudnn_frontend' 2025-12-04T09:18:51.9440917Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T09:18:51.9465895Z Entering 'third_party/cutlass' 2025-12-04T09:18:51.9537029Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T09:18:51.9571997Z Entering 'third_party/fbgemm' 2025-12-04T09:18:51.9645721Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T09:18:51.9674316Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T09:18:51.9745586Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T09:18:51.9769775Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T09:18:51.9843829Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T09:18:51.9876596Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T09:18:51.9947000Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T09:18:51.9971122Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T09:18:52.0042120Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T09:18:52.0075936Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T09:18:52.0145582Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T09:18:52.0169442Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T09:18:52.0239977Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T09:18:52.0263494Z Entering 'third_party/fbgemm/external/json' 2025-12-04T09:18:52.0339761Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T09:18:52.0369153Z Entering 'third_party/flash-attention' 2025-12-04T09:18:52.0443568Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T09:18:52.0467529Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T09:18:52.0537834Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T09:18:52.0568138Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T09:18:52.0638556Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T09:18:52.0674613Z Entering 'third_party/flatbuffers' 2025-12-04T09:18:52.0748923Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T09:18:52.0778822Z Entering 'third_party/fmt' 2025-12-04T09:18:52.0852090Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T09:18:52.0876830Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T09:18:52.0950201Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T09:18:52.0976034Z Entering 'third_party/gloo' 2025-12-04T09:18:52.1050484Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T09:18:52.1075772Z Entering 'third_party/googletest' 2025-12-04T09:18:52.1147398Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T09:18:52.1172556Z Entering 'third_party/ideep' 2025-12-04T09:18:52.1246368Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T09:18:52.1270563Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T09:18:52.1342827Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T09:18:52.1377993Z Entering 'third_party/ittapi' 2025-12-04T09:18:52.1449690Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T09:18:52.1475056Z Entering 'third_party/kineto' 2025-12-04T09:18:52.1547644Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T09:18:52.1571119Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T09:18:52.1649425Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T09:18:52.1671943Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T09:18:52.1744387Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T09:18:52.1770548Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T09:18:52.1845835Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T09:18:52.1870759Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T09:18:52.1942796Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T09:18:52.1966271Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T09:18:52.2038873Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T09:18:52.2060703Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T09:18:52.2138292Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T09:18:52.2166391Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T09:18:52.2238043Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T09:18:52.2262283Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T09:18:52.2335091Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T09:18:52.2359974Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T09:18:52.2430244Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T09:18:52.2457057Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T09:18:52.2529679Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T09:18:52.2553191Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T09:18:52.2625214Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T09:18:52.2646982Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:52.2719502Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T09:18:52.2747013Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:52.2820623Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T09:18:52.2855139Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T09:18:52.2926289Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T09:18:52.2950323Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T09:18:52.3021008Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T09:18:52.3050290Z Entering 'third_party/kleidiai' 2025-12-04T09:18:52.3122318Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T09:18:52.3149081Z Entering 'third_party/mimalloc' 2025-12-04T09:18:52.3221724Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T09:18:52.3248214Z Entering 'third_party/nlohmann' 2025-12-04T09:18:52.3320359Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T09:18:52.3348139Z Entering 'third_party/onnx' 2025-12-04T09:18:52.3419884Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T09:18:52.3461206Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T09:18:52.3534088Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T09:18:52.3564902Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T09:18:52.3637939Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T09:18:52.3662179Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T09:18:52.3739250Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T09:18:52.3763606Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T09:18:52.3834876Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T09:18:52.3859213Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T09:18:52.3931815Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T09:18:52.3957556Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T09:18:52.4029496Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T09:18:52.4054853Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T09:18:52.4130224Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T09:18:52.4154753Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T09:18:52.4223955Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T09:18:52.4248446Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T09:18:52.4316923Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T09:18:52.4339437Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:52.4413373Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T09:18:52.4440282Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:52.4512574Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T09:18:52.4540506Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T09:18:52.4613173Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T09:18:52.4661908Z Entering 'third_party/pocketfft' 2025-12-04T09:18:52.4739013Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T09:18:52.4763959Z Entering 'third_party/protobuf' 2025-12-04T09:18:52.4833778Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T09:18:52.4861398Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T09:18:52.4933914Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T09:18:52.4956655Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T09:18:52.5027205Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T09:18:52.5056957Z Entering 'third_party/psimd' 2025-12-04T09:18:52.5129804Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T09:18:52.5155669Z Entering 'third_party/pthreadpool' 2025-12-04T09:18:52.5228337Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T09:18:52.5252155Z Entering 'third_party/pybind11' 2025-12-04T09:18:52.5323709Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T09:18:52.5349098Z Entering 'third_party/python-peachpy' 2025-12-04T09:18:52.5422754Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T09:18:52.5449210Z Entering 'third_party/sleef' 2025-12-04T09:18:52.5521367Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T09:18:52.5547307Z Entering 'third_party/tensorpipe' 2025-12-04T09:18:52.5619085Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T09:18:52.5642441Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T09:18:52.5712457Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T09:18:52.5735994Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T09:18:52.5810373Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T09:18:52.5833799Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T09:18:52.5904797Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T09:18:52.5930281Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T09:18:52.6003907Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T09:18:52.6027046Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T09:18:52.6102749Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T09:18:52.7428104Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T09:18:52.7826020Z Entering 'android/libs/fbjni' 2025-12-04T09:18:52.7891563Z Entering 'third_party/FP16' 2025-12-04T09:18:52.7964537Z Entering 'third_party/FXdiv' 2025-12-04T09:18:52.8022356Z Entering 'third_party/NNPACK' 2025-12-04T09:18:52.8088933Z Entering 'third_party/NVTX' 2025-12-04T09:18:52.8152490Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T09:18:52.8212556Z Entering 'third_party/XNNPACK' 2025-12-04T09:18:52.8299566Z Entering 'third_party/aiter' 2025-12-04T09:18:52.8361545Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T09:18:52.8438521Z Entering 'third_party/benchmark' 2025-12-04T09:18:52.8497755Z Entering 'third_party/composable_kernel' 2025-12-04T09:18:52.8565714Z Entering 'third_party/cpp-httplib' 2025-12-04T09:18:52.8624103Z Entering 'third_party/cpuinfo' 2025-12-04T09:18:52.8692026Z Entering 'third_party/cudnn_frontend' 2025-12-04T09:18:52.8758232Z Entering 'third_party/cutlass' 2025-12-04T09:18:52.8835305Z Entering 'third_party/fbgemm' 2025-12-04T09:18:52.8895652Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T09:18:52.8958691Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T09:18:52.9031899Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T09:18:52.9089454Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T09:18:52.9155825Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T09:18:52.9216977Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T09:18:52.9284614Z Entering 'third_party/fbgemm/external/json' 2025-12-04T09:18:52.9352385Z Entering 'third_party/flash-attention' 2025-12-04T09:18:52.9419366Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T09:18:52.9484054Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T09:18:52.9559105Z Entering 'third_party/flatbuffers' 2025-12-04T09:18:52.9621317Z Entering 'third_party/fmt' 2025-12-04T09:18:52.9681930Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T09:18:52.9744721Z Entering 'third_party/gloo' 2025-12-04T09:18:52.9802776Z Entering 'third_party/googletest' 2025-12-04T09:18:52.9871332Z Entering 'third_party/ideep' 2025-12-04T09:18:52.9931388Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T09:18:53.0000765Z Entering 'third_party/ittapi' 2025-12-04T09:18:53.0065126Z Entering 'third_party/kineto' 2025-12-04T09:18:53.0134855Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T09:18:53.0190605Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T09:18:53.0252355Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T09:18:53.0310049Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T09:18:53.0374166Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T09:18:53.0432556Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T09:18:53.0498840Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T09:18:53.0558886Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T09:18:53.0617890Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T09:18:53.0679154Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T09:18:53.0745687Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T09:18:53.0802688Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:53.0865130Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:53.0939351Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T09:18:53.0996261Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T09:18:53.1058132Z Entering 'third_party/kleidiai' 2025-12-04T09:18:53.1118593Z Entering 'third_party/mimalloc' 2025-12-04T09:18:53.1191253Z Entering 'third_party/nlohmann' 2025-12-04T09:18:53.1248610Z Entering 'third_party/onnx' 2025-12-04T09:18:53.1332979Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T09:18:53.1401389Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T09:18:53.1469732Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T09:18:53.1527486Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T09:18:53.1585846Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T09:18:53.1640337Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T09:18:53.1699323Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T09:18:53.1756700Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T09:18:53.1813995Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T09:18:53.1869824Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:53.1931176Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:53.1999489Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T09:18:53.2081016Z Entering 'third_party/pocketfft' 2025-12-04T09:18:53.2141597Z Entering 'third_party/protobuf' 2025-12-04T09:18:53.2204022Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T09:18:53.2262809Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T09:18:53.2323260Z Entering 'third_party/psimd' 2025-12-04T09:18:53.2384920Z Entering 'third_party/pthreadpool' 2025-12-04T09:18:53.2445465Z Entering 'third_party/pybind11' 2025-12-04T09:18:53.2505966Z Entering 'third_party/python-peachpy' 2025-12-04T09:18:53.2569019Z Entering 'third_party/sleef' 2025-12-04T09:18:53.2628952Z Entering 'third_party/tensorpipe' 2025-12-04T09:18:53.2689773Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T09:18:53.2754759Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T09:18:53.2811351Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T09:18:53.2868243Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T09:18:53.2924630Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T09:18:53.3012600Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T09:18:53.3408370Z Entering 'android/libs/fbjni' 2025-12-04T09:18:53.3468796Z Entering 'third_party/FP16' 2025-12-04T09:18:53.3528159Z Entering 'third_party/FXdiv' 2025-12-04T09:18:53.3587837Z Entering 'third_party/NNPACK' 2025-12-04T09:18:53.3646875Z Entering 'third_party/NVTX' 2025-12-04T09:18:53.3706697Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T09:18:53.3767689Z Entering 'third_party/XNNPACK' 2025-12-04T09:18:53.3850953Z Entering 'third_party/aiter' 2025-12-04T09:18:53.3914025Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T09:18:53.3984122Z Entering 'third_party/benchmark' 2025-12-04T09:18:53.4044682Z Entering 'third_party/composable_kernel' 2025-12-04T09:18:53.4113027Z Entering 'third_party/cpp-httplib' 2025-12-04T09:18:53.4172432Z Entering 'third_party/cpuinfo' 2025-12-04T09:18:53.4232242Z Entering 'third_party/cudnn_frontend' 2025-12-04T09:18:53.4291098Z Entering 'third_party/cutlass' 2025-12-04T09:18:53.4361679Z Entering 'third_party/fbgemm' 2025-12-04T09:18:53.4424885Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T09:18:53.4482554Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T09:18:53.4549738Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T09:18:53.4607077Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T09:18:53.4672810Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T09:18:53.4730421Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T09:18:53.4786241Z Entering 'third_party/fbgemm/external/json' 2025-12-04T09:18:53.4852018Z Entering 'third_party/flash-attention' 2025-12-04T09:18:53.4911243Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T09:18:53.4975978Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T09:18:53.5044827Z Entering 'third_party/flatbuffers' 2025-12-04T09:18:53.5107585Z Entering 'third_party/fmt' 2025-12-04T09:18:53.5172789Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T09:18:53.5232213Z Entering 'third_party/gloo' 2025-12-04T09:18:53.5291517Z Entering 'third_party/googletest' 2025-12-04T09:18:53.5349590Z Entering 'third_party/ideep' 2025-12-04T09:18:53.5408471Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T09:18:53.5477874Z Entering 'third_party/ittapi' 2025-12-04T09:18:53.5537605Z Entering 'third_party/kineto' 2025-12-04T09:18:53.5595293Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T09:18:53.5653501Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T09:18:53.5711990Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T09:18:53.5773286Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T09:18:53.5830752Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T09:18:53.5887105Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T09:18:53.5951893Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T09:18:53.6010443Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T09:18:53.6069978Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T09:18:53.6130918Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T09:18:53.6188847Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T09:18:53.6248595Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:53.6309755Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:53.6378612Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T09:18:53.6434876Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T09:18:53.6497746Z Entering 'third_party/kleidiai' 2025-12-04T09:18:53.6559240Z Entering 'third_party/mimalloc' 2025-12-04T09:18:53.6619619Z Entering 'third_party/nlohmann' 2025-12-04T09:18:53.6680469Z Entering 'third_party/onnx' 2025-12-04T09:18:53.6755693Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T09:18:53.6820382Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T09:18:53.6884829Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T09:18:53.6942870Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T09:18:53.7001021Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T09:18:53.7058137Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T09:18:53.7118625Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T09:18:53.7175344Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T09:18:53.7233055Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T09:18:53.7287398Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:53.7353181Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:53.7415370Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T09:18:53.7499962Z Entering 'third_party/pocketfft' 2025-12-04T09:18:53.7560390Z Entering 'third_party/protobuf' 2025-12-04T09:18:53.7623280Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T09:18:53.7684748Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T09:18:53.7747182Z Entering 'third_party/psimd' 2025-12-04T09:18:53.7806350Z Entering 'third_party/pthreadpool' 2025-12-04T09:18:53.7871804Z Entering 'third_party/pybind11' 2025-12-04T09:18:53.7933631Z Entering 'third_party/python-peachpy' 2025-12-04T09:18:53.7996408Z Entering 'third_party/sleef' 2025-12-04T09:18:53.8056276Z Entering 'third_party/tensorpipe' 2025-12-04T09:18:53.8114363Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T09:18:53.8173909Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T09:18:53.8230615Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T09:18:53.8287359Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T09:18:53.8341497Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T09:18:53.8425610Z ##[endgroup] 2025-12-04T09:18:53.8472541Z [command]/usr/bin/git log -1 --format=%H 2025-12-04T09:18:53.8500770Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:53.8631420Z ##[group]Run cd "${GITHUB_WORKSPACE}" 2025-12-04T09:18:53.8631774Z cd "${GITHUB_WORKSPACE}" 2025-12-04T09:18:53.8632081Z # Clean stale submodule dirs 2025-12-04T09:18:53.8632407Z if [ -z "${NO_SUDO}" ]; then 2025-12-04T09:18:53.8632799Z  sudo git submodule foreach --recursive git clean -ffdx 2025-12-04T09:18:53.8633176Z else 2025-12-04T09:18:53.8633489Z  git submodule foreach --recursive git clean -ffdx 2025-12-04T09:18:53.8633849Z fi 2025-12-04T09:18:53.8645810Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:53.8646175Z env: 2025-12-04T09:18:53.8646398Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:53.8646673Z NO_SUDO: true 2025-12-04T09:18:53.8646889Z ##[endgroup] 2025-12-04T09:18:53.9069318Z Entering 'android/libs/fbjni' 2025-12-04T09:18:53.9116417Z Entering 'third_party/FP16' 2025-12-04T09:18:53.9162240Z Entering 'third_party/FXdiv' 2025-12-04T09:18:53.9207054Z Entering 'third_party/NNPACK' 2025-12-04T09:18:53.9256760Z Entering 'third_party/NVTX' 2025-12-04T09:18:53.9310797Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T09:18:53.9357384Z Entering 'third_party/XNNPACK' 2025-12-04T09:18:53.9512553Z Entering 'third_party/aiter' 2025-12-04T09:18:53.9572448Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T09:18:53.9722512Z Entering 'third_party/benchmark' 2025-12-04T09:18:53.9770315Z Entering 'third_party/composable_kernel' 2025-12-04T09:18:53.9929588Z Entering 'third_party/cpp-httplib' 2025-12-04T09:18:53.9978159Z Entering 'third_party/cpuinfo' 2025-12-04T09:18:54.0029896Z Entering 'third_party/cudnn_frontend' 2025-12-04T09:18:54.0079601Z Entering 'third_party/cutlass' 2025-12-04T09:18:54.0215628Z Entering 'third_party/fbgemm' 2025-12-04T09:18:54.0303201Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T09:18:54.0349710Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T09:18:54.0506877Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T09:18:54.0556385Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T09:18:54.0687181Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T09:18:54.0735212Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T09:18:54.0778437Z Entering 'third_party/fbgemm/external/json' 2025-12-04T09:18:54.0846055Z Entering 'third_party/flash-attention' 2025-12-04T09:18:54.0901213Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T09:18:54.1035426Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T09:18:54.1154371Z Entering 'third_party/flatbuffers' 2025-12-04T09:18:54.1254032Z Entering 'third_party/fmt' 2025-12-04T09:18:54.1301548Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T09:18:54.1349691Z Entering 'third_party/gloo' 2025-12-04T09:18:54.1397465Z Entering 'third_party/googletest' 2025-12-04T09:18:54.1449544Z Entering 'third_party/ideep' 2025-12-04T09:18:54.1491905Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T09:18:54.1608965Z Entering 'third_party/ittapi' 2025-12-04T09:18:54.1659093Z Entering 'third_party/kineto' 2025-12-04T09:18:54.1707863Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T09:18:54.1759012Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T09:18:54.1822104Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T09:18:54.1868087Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T09:18:54.1921845Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T09:18:54.1961831Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T09:18:54.2008077Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T09:18:54.2051317Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T09:18:54.2098580Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T09:18:54.2156348Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T09:18:54.2199054Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T09:18:54.2245482Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:54.2312479Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:54.2370943Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T09:18:54.2415782Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T09:18:54.2466983Z Entering 'third_party/kleidiai' 2025-12-04T09:18:54.2522374Z Entering 'third_party/mimalloc' 2025-12-04T09:18:54.2570573Z Entering 'third_party/nlohmann' 2025-12-04T09:18:54.2634262Z Entering 'third_party/onnx' 2025-12-04T09:18:54.3102663Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T09:18:54.3157299Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T09:18:54.3234794Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T09:18:54.3278891Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T09:18:54.3325457Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T09:18:54.3367288Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T09:18:54.3425329Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T09:18:54.3470300Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T09:18:54.3514703Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T09:18:54.3557831Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T09:18:54.3623255Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T09:18:54.3674337Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T09:18:54.4036472Z Entering 'third_party/pocketfft' 2025-12-04T09:18:54.4079019Z Entering 'third_party/protobuf' 2025-12-04T09:18:54.4186462Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T09:18:54.4234960Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T09:18:54.4289938Z Entering 'third_party/psimd' 2025-12-04T09:18:54.4333180Z Entering 'third_party/pthreadpool' 2025-12-04T09:18:54.4376877Z Entering 'third_party/pybind11' 2025-12-04T09:18:54.4426908Z Entering 'third_party/python-peachpy' 2025-12-04T09:18:54.4473409Z Entering 'third_party/sleef' 2025-12-04T09:18:54.4521732Z Entering 'third_party/tensorpipe' 2025-12-04T09:18:54.4570550Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T09:18:54.4619273Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T09:18:54.4667593Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T09:18:54.4721447Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T09:18:54.4763671Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T09:18:54.4945324Z Prepare all required actions 2025-12-04T09:18:54.4945826Z Getting action download info 2025-12-04T09:18:54.6571517Z ##[group]Run ./.github/actions/setup-linux 2025-12-04T09:18:54.6571823Z env: 2025-12-04T09:18:54.6572044Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:54.6572313Z ##[endgroup] 2025-12-04T09:18:54.6609182Z ##[group]Run set -euo pipefail 2025-12-04T09:18:54.6609513Z set -euo pipefail 2025-12-04T09:18:54.6609811Z function get_ec2_metadata() { 2025-12-04T09:18:54.6610193Z  # Pulled from instance metadata endpoint for EC2 2025-12-04T09:18:54.6610806Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2025-12-04T09:18:54.6611352Z  category=$1 2025-12-04T09:18:54.6611902Z  # If it is GCP runner (runner name contains gcp), do not run this 2025-12-04T09:18:54.6612328Z  runner_name_str=i-04e943c8c4e7268ee 2025-12-04T09:18:54.6612697Z  if [[ -f /.inarc ]]; then 2025-12-04T09:18:54.6613040Z  echo "ARC Runner, no info on ec2 metadata" 2025-12-04T09:18:54.6613428Z  elif [[ $runner_name_str == *"gcp"* ]]; then 2025-12-04T09:18:54.6613894Z  echo "Runner is from Google Cloud Platform, No info on ec2 metadata" 2025-12-04T09:18:54.6614309Z  else 2025-12-04T09:18:54.6615117Z  curl -H "X-aws-ec2-metadata-token: $(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 30")" -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2025-12-04T09:18:54.6615972Z  fi 2025-12-04T09:18:54.6616191Z } 2025-12-04T09:18:54.6616462Z echo "ami-id: $(get_ec2_metadata ami-id)" 2025-12-04T09:18:54.6616887Z echo "instance-id: $(get_ec2_metadata instance-id)" 2025-12-04T09:18:54.6617363Z echo "instance-type: $(get_ec2_metadata instance-type)" 2025-12-04T09:18:54.6617776Z echo "system info $(uname -a)" 2025-12-04T09:18:54.6627275Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:54.6627643Z env: 2025-12-04T09:18:54.6627862Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:54.6628112Z ##[endgroup] 2025-12-04T09:18:54.6804930Z ami-id: ami-08982f1c5bf93d976 2025-12-04T09:18:54.6926464Z instance-id: i-04e943c8c4e7268ee 2025-12-04T09:18:54.7044415Z instance-type: g5.4xlarge 2025-12-04T09:18:54.7059047Z system info Linux ip-10-0-73-227.ec2.internal 6.1.150-174.273.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Sep 9 12:21:26 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux 2025-12-04T09:18:54.7082834Z ##[group]Run if [ -f /usr/bin/nvidia-smi ]; then nvidia-smi; fi 2025-12-04T09:18:54.7083305Z if [ -f /usr/bin/nvidia-smi ]; then nvidia-smi; fi 2025-12-04T09:18:54.7092432Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:54.7092793Z env: 2025-12-04T09:18:54.7093002Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:54.7093263Z ##[endgroup] 2025-12-04T09:18:56.3151208Z Thu Dec 4 09:18:56 2025 2025-12-04T09:18:56.3151772Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:18:56.3152468Z | NVIDIA-SMI 580.82.07 Driver Version: 580.82.07 CUDA Version: 13.0 | 2025-12-04T09:18:56.3152977Z +-----------------------------------------+------------------------+----------------------+ 2025-12-04T09:18:56.3153487Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-12-04T09:18:56.3154048Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-12-04T09:18:56.3154507Z | | | MIG M. | 2025-12-04T09:18:56.3154872Z |=========================================+========================+======================| 2025-12-04T09:18:56.3251504Z | 0 NVIDIA A10G Off | 00000000:00:1E.0 Off | 0 | 2025-12-04T09:18:56.3253287Z | 0% 26C P0 52W / 300W | 0MiB / 23028MiB | 3% Default | 2025-12-04T09:18:56.3254213Z | | | N/A | 2025-12-04T09:18:56.3255284Z +-----------------------------------------+------------------------+----------------------+ 2025-12-04T09:18:56.3256129Z 2025-12-04T09:18:56.3256631Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:18:56.3257848Z | Processes: | 2025-12-04T09:18:56.3259034Z | GPU GI CI PID Type Process name GPU Memory | 2025-12-04T09:18:56.3259632Z | ID ID Usage | 2025-12-04T09:18:56.3260180Z |=========================================================================================| 2025-12-04T09:18:56.3260625Z | No running processes found | 2025-12-04T09:18:56.3261120Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:18:56.7735301Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-12-04T09:18:56.7736192Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-12-04T09:18:56.7746941Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:56.7747305Z env: 2025-12-04T09:18:56.7747512Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:56.7747776Z ##[endgroup] 2025-12-04T09:18:56.7809532Z ##[group]Run if systemctl is-active --quiet docker; then 2025-12-04T09:18:56.7809975Z if systemctl is-active --quiet docker; then 2025-12-04T09:18:56.7810357Z  echo "Docker daemon is running..."; 2025-12-04T09:18:56.7810979Z else 2025-12-04T09:18:56.7811336Z  echo "Starting docker daemon..." && sudo systemctl start docker; 2025-12-04T09:18:56.7811737Z fi 2025-12-04T09:18:56.7820671Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:56.7821037Z env: 2025-12-04T09:18:56.7821248Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:56.7821506Z ##[endgroup] 2025-12-04T09:18:56.7919009Z Docker daemon is running... 2025-12-04T09:18:56.7960648Z ##[group]Run nick-fields/retry@v3.0.0 2025-12-04T09:18:56.7960950Z with: 2025-12-04T09:18:56.7961155Z shell: bash 2025-12-04T09:18:56.7961381Z timeout_minutes: 5 2025-12-04T09:18:56.7961631Z max_attempts: 3 2025-12-04T09:18:56.7961863Z retry_wait_seconds: 30 2025-12-04T09:18:56.7964001Z command: AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" # For LF Runners we need to make sure we also login to Meta's ECR docker registry too. META_AWS_ACCOUNT_ID=308535385114 if [ "$AWS_ACCOUNT_ID" != "$META_AWS_ACCOUNT_ID" ] ; then aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$META_AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" fi 2025-12-04T09:18:56.7966130Z polling_interval_seconds: 1 2025-12-04T09:18:56.7966411Z warning_on_retry: true 2025-12-04T09:18:56.7966672Z continue_on_error: false 2025-12-04T09:18:56.7966915Z env: 2025-12-04T09:18:56.7967131Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:56.7967393Z AWS_RETRY_MODE: standard 2025-12-04T09:18:56.7967645Z AWS_MAX_ATTEMPTS: 5 2025-12-04T09:18:56.7967902Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:56.7968179Z ##[endgroup] 2025-12-04T09:18:57.9531569Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-12-04T09:18:57.9532451Z Configure a credential helper to remove this warning. See 2025-12-04T09:18:57.9533028Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-12-04T09:18:57.9533402Z 2025-12-04T09:18:57.9533497Z Login Succeeded 2025-12-04T09:18:58.8841493Z Command completed after 1 attempt(s). 2025-12-04T09:18:58.8912123Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:18:58.8912625Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:18:58.8913079Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:18:58.8925252Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:58.8925612Z env: 2025-12-04T09:18:58.8925834Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:58.8926095Z ##[endgroup] 2025-12-04T09:18:58.9025501Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T09:18:58.9026084Z # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T09:18:58.9026684Z # shellcheck disable=SC2046 2025-12-04T09:18:58.9027041Z docker stop $(docker ps -q) || true 2025-12-04T09:18:58.9027388Z # Prune all of the docker images 2025-12-04T09:18:58.9027709Z docker system prune -af 2025-12-04T09:18:58.9037677Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:58.9038041Z env: 2025-12-04T09:18:58.9038257Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:58.9038510Z ##[endgroup] 2025-12-04T09:18:58.9344683Z "docker stop" requires at least 1 argument. 2025-12-04T09:18:58.9345188Z See 'docker stop --help'. 2025-12-04T09:18:58.9345420Z 2025-12-04T09:18:58.9345653Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 2025-12-04T09:18:58.9345920Z 2025-12-04T09:18:58.9346033Z Stop one or more running containers 2025-12-04T09:18:58.9574776Z Total reclaimed space: 0B 2025-12-04T09:18:58.9775347Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-12-04T09:18:58.9775794Z with: 2025-12-04T09:18:58.9776549Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:58.9777393Z use-custom-docker-registry: true 2025-12-04T09:18:58.9777710Z docker-build-dir: .ci/docker 2025-12-04T09:18:58.9778011Z docker-build-script: ./build.sh 2025-12-04T09:18:58.9778310Z working-directory: . 2025-12-04T09:18:58.9778654Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:18:58.9779050Z force-push: false 2025-12-04T09:18:58.9779275Z env: 2025-12-04T09:18:58.9779482Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:58.9779746Z ##[endgroup] 2025-12-04T09:18:58.9798929Z ##[group]Run set -ex 2025-12-04T09:18:58.9799217Z set -ex 2025-12-04T09:18:58.9799458Z  2025-12-04T09:18:58.9799884Z # If the docker build directory or the build script doesn't exist, the action will 2025-12-04T09:18:58.9800545Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-12-04T09:18:58.9801104Z # job could then download the pre-built image as usual 2025-12-04T09:18:58.9801783Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-12-04T09:18:58.9802421Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9802741Z else 2025-12-04T09:18:58.9803014Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9803460Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9803867Z  2025-12-04T09:18:58.9804403Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 2025-12-04T09:18:58.9805036Z  exit 0 2025-12-04T09:18:58.9805270Z fi 2025-12-04T09:18:58.9805477Z  2025-12-04T09:18:58.9805827Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-12-04T09:18:58.9806427Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-12-04T09:18:58.9806958Z  # use it as it is, but first let's extract the tag 2025-12-04T09:18:58.9807435Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-12-04T09:18:58.9807943Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9808426Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9808826Z else 2025-12-04T09:18:58.9809088Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-12-04T09:18:58.9809477Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-12-04T09:18:58.9810050Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-12-04T09:18:58.9810376Z  fi 2025-12-04T09:18:58.9810828Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-12-04T09:18:58.9811428Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9812052Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9812736Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9813150Z fi 2025-12-04T09:18:58.9822229Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:58.9822679Z env: 2025-12-04T09:18:58.9822896Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:58.9823158Z REPO_NAME: pytorch 2025-12-04T09:18:58.9824077Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:58.9825389Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T09:18:58.9825678Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-12-04T09:18:58.9826053Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:18:58.9826441Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-12-04T09:18:58.9826727Z CUSTOM_TAG_PREFIX: 2025-12-04T09:18:58.9826969Z ##[endgroup] 2025-12-04T09:18:58.9858939Z + [[ -d .ci/docker ]] 2025-12-04T09:18:58.9859300Z + [[ -f .ci/docker/./build.sh ]] 2025-12-04T09:18:58.9859617Z + [[ true == \t\r\u\e ]] 2025-12-04T09:18:58.9859904Z + echo skip=false 2025-12-04T09:18:58.9860972Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-12-04T09:18:58.9867847Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:58.9868738Z ++ awk -F '[:,]' '{print $2}' 2025-12-04T09:18:58.9895044Z + DOCKER_TAG=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:58.9896180Z + echo docker-tag=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:58.9897573Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:58.9922975Z ##[group]Run set +e 2025-12-04T09:18:58.9923284Z set +e 2025-12-04T09:18:58.9923521Z set -x 2025-12-04T09:18:58.9923747Z  2025-12-04T09:18:58.9923962Z login() { 2025-12-04T09:18:58.9924804Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T09:18:58.9925333Z } 2025-12-04T09:18:58.9925564Z  2025-12-04T09:18:58.9925771Z retry () { 2025-12-04T09:18:58.9926049Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T09:18:58.9926361Z } 2025-12-04T09:18:58.9926568Z  2025-12-04T09:18:58.9926816Z retry login "${DOCKER_REGISTRY}" 2025-12-04T09:18:58.9927127Z  2025-12-04T09:18:58.9927352Z START_TIME=$(date +%s) 2025-12-04T09:18:58.9927645Z # Wait up to 120 minutes 2025-12-04T09:18:58.9928016Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-12-04T09:18:58.9928504Z  # Check if image already exists, if it does then skip building it 2025-12-04T09:18:58.9928990Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-12-04T09:18:58.9929365Z  exit 0 2025-12-04T09:18:58.9929606Z  fi 2025-12-04T09:18:58.9929818Z  2025-12-04T09:18:58.9930401Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-12-04T09:18:58.9931059Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-12-04T09:18:58.9931704Z  # latter, it will wait for the Docker images to become available before continuing 2025-12-04T09:18:58.9932210Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-12-04T09:18:58.9932614Z  # It's a Docker build job, let's build the image 2025-12-04T09:18:58.9932959Z  break 2025-12-04T09:18:58.9933203Z  else 2025-12-04T09:18:58.9933545Z  # It's a regular build job, wait for the image to become available 2025-12-04T09:18:58.9933960Z  sleep 300 2025-12-04T09:18:58.9934216Z  fi 2025-12-04T09:18:58.9934434Z done 2025-12-04T09:18:58.9934656Z  2025-12-04T09:18:58.9935014Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-12-04T09:18:58.9935750Z # be empty. The default action would be to continue rebuild the image 2025-12-04T09:18:58.9936267Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-12-04T09:18:58.9936721Z  # if we're on the base branch then use the parent commit 2025-12-04T09:18:58.9937126Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-12-04T09:18:58.9937436Z else 2025-12-04T09:18:58.9937770Z  # otherwise we're on a PR, so use the most recent base commit 2025-12-04T09:18:58.9938281Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-12-04T09:18:58.9938688Z fi 2025-12-04T09:18:58.9938901Z  2025-12-04T09:18:58.9939135Z if [[ -z "${MERGE_BASE}" ]]; then 2025-12-04T09:18:58.9939498Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9939818Z  2025-12-04T09:18:58.9940275Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-12-04T09:18:58.9940860Z  exit 0 2025-12-04T09:18:58.9941101Z fi 2025-12-04T09:18:58.9941320Z  2025-12-04T09:18:58.9941631Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-12-04T09:18:58.9942300Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-12-04T09:18:58.9942931Z  exit 1 2025-12-04T09:18:58.9943169Z fi 2025-12-04T09:18:58.9943396Z  2025-12-04T09:18:58.9943761Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-12-04T09:18:58.9944412Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-12-04T09:18:58.9944989Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-12-04T09:18:58.9945654Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-12-04T09:18:58.9946400Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-12-04T09:18:58.9946851Z fi 2025-12-04T09:18:58.9947067Z  2025-12-04T09:18:58.9947330Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:58.9956636Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:58.9957002Z env: 2025-12-04T09:18:58.9957224Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:58.9968881Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T09:18:58.9969250Z BASE_REVISION: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:58.9970147Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:58.9971232Z DOCKER_TAG: pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:58.9972012Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:18:58.9972406Z DOCKER_PUSH: 2025-12-04T09:18:58.9972642Z ##[endgroup] 2025-12-04T09:18:59.0004195Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:18:59.0004641Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:18:59.0008091Z + aws ecr get-login-password --region us-east-1 2025-12-04T09:18:59.0008901Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:18:59.5390145Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-12-04T09:18:59.5390741Z Configure a credential helper to remove this warning. See 2025-12-04T09:18:59.5391306Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-12-04T09:18:59.5391681Z 2025-12-04T09:18:59.5391814Z Login Succeeded 2025-12-04T09:18:59.5420016Z ++ date +%s 2025-12-04T09:18:59.5435329Z + START_TIME=1764839939 2025-12-04T09:18:59.5439355Z ++ date +%s 2025-12-04T09:18:59.5453094Z + [[ 1764832739 -lt 1764839939 ]] 2025-12-04T09:18:59.5453989Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:59.7732482Z { 2025-12-04T09:18:59.7732781Z "schemaVersion": 2, 2025-12-04T09:18:59.7733323Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-12-04T09:18:59.7733809Z "config": { 2025-12-04T09:18:59.7734145Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-12-04T09:18:59.7734542Z "size": 34864, 2025-12-04T09:18:59.7734943Z "digest": "sha256:add7313791033822205cdb3cf32096534b2cfaa4855bd48119b59000bfe00301" 2025-12-04T09:18:59.7735391Z }, 2025-12-04T09:18:59.7735590Z "layers": [ 2025-12-04T09:18:59.7735808Z { 2025-12-04T09:18:59.7736146Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7736558Z "size": 30447951, 2025-12-04T09:18:59.7737000Z "digest": "sha256:63e5bc7682b85ae57a1221210f64d62e7a90b0a30f19af4ca734b8242ae49d63" 2025-12-04T09:18:59.7737450Z }, 2025-12-04T09:18:59.7737638Z { 2025-12-04T09:18:59.7737963Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7738364Z "size": 1554, 2025-12-04T09:18:59.7738744Z "digest": "sha256:0678d56345c994444b77bb70b1177189d23e794748b1d75ffc45d227c7dea94a" 2025-12-04T09:18:59.7739238Z }, 2025-12-04T09:18:59.7739449Z { 2025-12-04T09:18:59.7739773Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7740176Z "size": 313275661, 2025-12-04T09:18:59.7740602Z "digest": "sha256:45f5c9ddfce78349dff3d5edfbaa0310ae17311f66abdcd7e00fa21b500e801c" 2025-12-04T09:18:59.7741061Z }, 2025-12-04T09:18:59.7741255Z { 2025-12-04T09:18:59.7741581Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7741976Z "size": 787, 2025-12-04T09:18:59.7742378Z "digest": "sha256:086b1df51ac1162d9c45698e9dfaf91c6c222c8bd9ab01797ac8f9344bc8044f" 2025-12-04T09:18:59.7742938Z }, 2025-12-04T09:18:59.7743133Z { 2025-12-04T09:18:59.7743453Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7743900Z "size": 106, 2025-12-04T09:18:59.7744308Z "digest": "sha256:fe8a7b64bf98352f89057bcba66beef2fb44cc05fbd3606abccd8e86cf476234" 2025-12-04T09:18:59.7744763Z }, 2025-12-04T09:18:59.7744955Z { 2025-12-04T09:18:59.7745277Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7745673Z "size": 703, 2025-12-04T09:18:59.7746069Z "digest": "sha256:7680723e9a578033dd106b45784c639f06cc8adb1f5239ec513d9de01087c1af" 2025-12-04T09:18:59.7746510Z }, 2025-12-04T09:18:59.7746699Z { 2025-12-04T09:18:59.7747024Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7747425Z "size": 1216, 2025-12-04T09:18:59.7747822Z "digest": "sha256:9c5027aeeb4e3101f48c1d2e400c387110e1009e42497ee801f1b4b7f7edb5c0" 2025-12-04T09:18:59.7748517Z }, 2025-12-04T09:18:59.7748715Z { 2025-12-04T09:18:59.7749038Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7749434Z "size": 483, 2025-12-04T09:18:59.7749817Z "digest": "sha256:9a56521103600bd37a1e7c1191b5136c2d738c092f8a6701499f7068a32c2628" 2025-12-04T09:18:59.7750257Z }, 2025-12-04T09:18:59.7750441Z { 2025-12-04T09:18:59.7750768Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7751212Z "size": 110361875, 2025-12-04T09:18:59.7751619Z "digest": "sha256:375c4427e9141269458333b1463fdb219e736fd6231ec1c56c625c48437ace77" 2025-12-04T09:18:59.7752054Z }, 2025-12-04T09:18:59.7752282Z { 2025-12-04T09:18:59.7752613Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7753014Z "size": 4961, 2025-12-04T09:18:59.7753416Z "digest": "sha256:a86faaa7dbdd70e678e5ea20072637ee42618921ca8f80ca089f789325d4b0c2" 2025-12-04T09:18:59.7753871Z }, 2025-12-04T09:18:59.7754055Z { 2025-12-04T09:18:59.7754539Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7754941Z "size": 1755, 2025-12-04T09:18:59.7755325Z "digest": "sha256:fb7848686804957915d98f8655ef6da0fe4c521b50a82aefdebf475983505a15" 2025-12-04T09:18:59.7755761Z }, 2025-12-04T09:18:59.7755949Z { 2025-12-04T09:18:59.7756262Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7756748Z "size": 724, 2025-12-04T09:18:59.7757149Z "digest": "sha256:3541df015cdb7e8925273399d28e56c31b3c9196f00439ac2925537b173b1f84" 2025-12-04T09:18:59.7757577Z }, 2025-12-04T09:18:59.7757774Z { 2025-12-04T09:18:59.7758103Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7758502Z "size": 543, 2025-12-04T09:18:59.7758889Z "digest": "sha256:79dc80f426b29d4ae9157b967050b03e66aa0c4b1295b944a1dd70106be87066" 2025-12-04T09:18:59.7759335Z }, 2025-12-04T09:18:59.7759539Z { 2025-12-04T09:18:59.7759862Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7760276Z "size": 3185190117, 2025-12-04T09:18:59.7760700Z "digest": "sha256:a13fcc1b90bb9c251ebe7ef2a03c4cb3afa1c8bdafe84f5f85136773059a3735" 2025-12-04T09:18:59.7761156Z }, 2025-12-04T09:18:59.7761341Z { 2025-12-04T09:18:59.7761665Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7762059Z "size": 32, 2025-12-04T09:18:59.7762459Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7762907Z }, 2025-12-04T09:18:59.7763093Z { 2025-12-04T09:18:59.7763415Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7763816Z "size": 396, 2025-12-04T09:18:59.7764207Z "digest": "sha256:549db4d6c618ecd9534658a233e3c90508f82d8735f965c2786b2eaa078869e5" 2025-12-04T09:18:59.7764642Z }, 2025-12-04T09:18:59.7764830Z { 2025-12-04T09:18:59.7765156Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7765855Z "size": 236860, 2025-12-04T09:18:59.7766286Z "digest": "sha256:5c63528cb580001e65104f4cb0809bf0673a00f989a7db42fd6d86aa1ec27cee" 2025-12-04T09:18:59.7766728Z }, 2025-12-04T09:18:59.7766916Z { 2025-12-04T09:18:59.7767249Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7767658Z "size": 231, 2025-12-04T09:18:59.7768050Z "digest": "sha256:75bd83b989a44e4d4119a3f972891025eb0e9ce95cfbe4a0ca5cdbe7130028d6" 2025-12-04T09:18:59.7768499Z }, 2025-12-04T09:18:59.7768691Z { 2025-12-04T09:18:59.7769019Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7769427Z "size": 3043497, 2025-12-04T09:18:59.7769836Z "digest": "sha256:de6e78970f517178cb91f36cd02bd9ca7b72a08fb82a0f9007516026f258c035" 2025-12-04T09:18:59.7770283Z }, 2025-12-04T09:18:59.7770470Z { 2025-12-04T09:18:59.7770790Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7771344Z "size": 1472, 2025-12-04T09:18:59.7771751Z "digest": "sha256:e13ed7c7e4736e81dc21af755b3363eb26e4d3b2f1ca988dfe65effa47d8fa42" 2025-12-04T09:18:59.7772218Z }, 2025-12-04T09:18:59.7772412Z { 2025-12-04T09:18:59.7772731Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7773139Z "size": 481, 2025-12-04T09:18:59.7773562Z "digest": "sha256:6e2949bcb74152577a0f20c38bcb6dd80f5e68427e3e531a80e08c9ecc73a979" 2025-12-04T09:18:59.7774012Z }, 2025-12-04T09:18:59.7774195Z { 2025-12-04T09:18:59.7774523Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7774929Z "size": 202, 2025-12-04T09:18:59.7775345Z "digest": "sha256:14d69d9aaec70287efd2fd35c4f93e43a29a4098458cc9fca1c93f02ad7356cb" 2025-12-04T09:18:59.7775788Z }, 2025-12-04T09:18:59.7775979Z { 2025-12-04T09:18:59.7776312Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7776714Z "size": 607, 2025-12-04T09:18:59.7777221Z "digest": "sha256:5c02769dd8e5bba2f7f5fd84bde9595fcb3bdbffcae497503fa846f9b5e78bf5" 2025-12-04T09:18:59.7777688Z }, 2025-12-04T09:18:59.7777877Z { 2025-12-04T09:18:59.7778205Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7778615Z "size": 7889619584, 2025-12-04T09:18:59.7779033Z "digest": "sha256:35041ce524ac4afec40ecd73b1393c830614f1f79d43a6439767a6c7d5b7027b" 2025-12-04T09:18:59.7779493Z }, 2025-12-04T09:18:59.7779694Z { 2025-12-04T09:18:59.7780018Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7780430Z "size": 830, 2025-12-04T09:18:59.7780821Z "digest": "sha256:2fa92dc5885e080e049ceb4139288b6c0e39fab34256945708b08ea55a1f7a0b" 2025-12-04T09:18:59.7781316Z }, 2025-12-04T09:18:59.7781500Z { 2025-12-04T09:18:59.7781827Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7782235Z "size": 33451739, 2025-12-04T09:18:59.7782741Z "digest": "sha256:2b85eafbd92a0e70a0a70154ad8bf4584095e576d95873368f30373f5966714a" 2025-12-04T09:18:59.7783191Z }, 2025-12-04T09:18:59.7783384Z { 2025-12-04T09:18:59.7783702Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7784103Z "size": 104, 2025-12-04T09:18:59.7784510Z "digest": "sha256:ff755a4ddad7880f23c6b767d432d6f1eafdb62b3ea18f8a98e22c441c099fcb" 2025-12-04T09:18:59.7784962Z }, 2025-12-04T09:18:59.7785158Z { 2025-12-04T09:18:59.7785484Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7785881Z "size": 1496, 2025-12-04T09:18:59.7786277Z "digest": "sha256:09eb41bdf42d8605b57b2363348154140904dec914b34a67298b82122bfce2b3" 2025-12-04T09:18:59.7786720Z }, 2025-12-04T09:18:59.7786909Z { 2025-12-04T09:18:59.7787225Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7787632Z "size": 458787828, 2025-12-04T09:18:59.7788046Z "digest": "sha256:11ede4d59e935e62f41b33220fe871794ab5e57ce724173b713368977683bcf6" 2025-12-04T09:18:59.7788491Z }, 2025-12-04T09:18:59.7788686Z { 2025-12-04T09:18:59.7789010Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7789408Z "size": 164, 2025-12-04T09:18:59.7789807Z "digest": "sha256:1283cd8f801a142172f3ab76fd472df8583223d9437de3e4d18d8cf98ea3fa98" 2025-12-04T09:18:59.7790255Z }, 2025-12-04T09:18:59.7790441Z { 2025-12-04T09:18:59.7790775Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7791177Z "size": 346, 2025-12-04T09:18:59.7791592Z "digest": "sha256:024fa855425fa524ad4500660cf61d53be62b99556d31b8b280d14caba434a35" 2025-12-04T09:18:59.7792048Z }, 2025-12-04T09:18:59.7792245Z { 2025-12-04T09:18:59.7792578Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7792978Z "size": 32, 2025-12-04T09:18:59.7793391Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7793941Z }, 2025-12-04T09:18:59.7794136Z { 2025-12-04T09:18:59.7794483Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7794905Z "size": 106, 2025-12-04T09:18:59.7795322Z "digest": "sha256:303e6747a62efecf5efa1f97d0e66b40a3b39da8d79a51f75b89f4c92ae7ec52" 2025-12-04T09:18:59.7795795Z }, 2025-12-04T09:18:59.7796000Z { 2025-12-04T09:18:59.7796331Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7796750Z "size": 424, 2025-12-04T09:18:59.7797171Z "digest": "sha256:3017cdf4838bcc9a33daebc07487f8ae1f6bd6e7ce8322c14f5480e8db9ef90e" 2025-12-04T09:18:59.7797638Z }, 2025-12-04T09:18:59.7797827Z { 2025-12-04T09:18:59.7798157Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7798575Z "size": 19309374, 2025-12-04T09:18:59.7798996Z "digest": "sha256:6b6cd1c358e886dc6ed7fd46ac4bcc1a0a73b7b1301739ea1953478ee5d83f50" 2025-12-04T09:18:59.7799468Z }, 2025-12-04T09:18:59.7799667Z { 2025-12-04T09:18:59.7800116Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7800523Z "size": 108, 2025-12-04T09:18:59.7800919Z "digest": "sha256:b2dd045011241d1cf8889e2a7369d9fe4844dfe15529b520ccd6a59bd3c1532e" 2025-12-04T09:18:59.7801400Z }, 2025-12-04T09:18:59.7801595Z { 2025-12-04T09:18:59.7801922Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7802326Z "size": 827, 2025-12-04T09:18:59.7802720Z "digest": "sha256:55adc51fe5897031d4cf2f2b8fd162213f6e46a52848630c616606271b97952e" 2025-12-04T09:18:59.7803166Z }, 2025-12-04T09:18:59.7803356Z { 2025-12-04T09:18:59.7803675Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7804082Z "size": 724, 2025-12-04T09:18:59.7804463Z "digest": "sha256:3541df015cdb7e8925273399d28e56c31b3c9196f00439ac2925537b173b1f84" 2025-12-04T09:18:59.7804901Z }, 2025-12-04T09:18:59.7805085Z { 2025-12-04T09:18:59.7805418Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7805820Z "size": 149, 2025-12-04T09:18:59.7806214Z "digest": "sha256:a43ca0e4b837964b12b7469194cfe939c26de027298040028975324dce25938a" 2025-12-04T09:18:59.7806656Z }, 2025-12-04T09:18:59.7806842Z { 2025-12-04T09:18:59.7807167Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7807567Z "size": 138, 2025-12-04T09:18:59.7807966Z "digest": "sha256:b7212f17fd1404837fcfdd086dd0e2667931e4db377d45d8d89a44390c84e11d" 2025-12-04T09:18:59.7808407Z }, 2025-12-04T09:18:59.7808610Z { 2025-12-04T09:18:59.7808940Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7809343Z "size": 141, 2025-12-04T09:18:59.7809743Z "digest": "sha256:083e42cac090e6486c35f392b64ee54448f5e4aa947003aeb3e1f92c8ea5c099" 2025-12-04T09:18:59.7810196Z }, 2025-12-04T09:18:59.7810388Z { 2025-12-04T09:18:59.7810716Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7811132Z "size": 32, 2025-12-04T09:18:59.7811534Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7811999Z }, 2025-12-04T09:18:59.7812201Z { 2025-12-04T09:18:59.7812527Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7812936Z "size": 223, 2025-12-04T09:18:59.7813345Z "digest": "sha256:0a00b784a4aac341795729b254f7edd09e811b7f51d0c58e0e6bfeeee6940503" 2025-12-04T09:18:59.7813801Z }, 2025-12-04T09:18:59.7813986Z { 2025-12-04T09:18:59.7814313Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7814717Z "size": 255, 2025-12-04T09:18:59.7815105Z "digest": "sha256:c6173c779f7ba143a21214ea5f032b141863a37ceb4c0ac01d3248c216ce5241" 2025-12-04T09:18:59.7815551Z }, 2025-12-04T09:18:59.7815749Z { 2025-12-04T09:18:59.7816066Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7816565Z "size": 145520672, 2025-12-04T09:18:59.7817003Z "digest": "sha256:ed3d1e3387b924585c332bf1bc252fa159cd0d25256a874043ff0141b1ab5ff7" 2025-12-04T09:18:59.7817455Z }, 2025-12-04T09:18:59.7817663Z { 2025-12-04T09:18:59.7817998Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7818411Z "size": 106, 2025-12-04T09:18:59.7818813Z "digest": "sha256:b29343478586aeee19d2a622661716f6f1591280c890f49b727a8da13a610784" 2025-12-04T09:18:59.7819262Z }, 2025-12-04T09:18:59.7819467Z { 2025-12-04T09:18:59.7819793Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7820220Z "size": 312293530, 2025-12-04T09:18:59.7820648Z "digest": "sha256:c6f0520487fb506bc4601fd84d5f28d8a76b203e004731e4b2067c2ab1a14e0b" 2025-12-04T09:18:59.7821097Z }, 2025-12-04T09:18:59.7821299Z { 2025-12-04T09:18:59.7821638Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7822064Z "size": 3058011133, 2025-12-04T09:18:59.7822653Z "digest": "sha256:148171691cd4c4d20310d490d4b4dd903490d04ea07fb8f7e668a28768683e9a" 2025-12-04T09:18:59.7823118Z }, 2025-12-04T09:18:59.7823311Z { 2025-12-04T09:18:59.7823639Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7824058Z "size": 129, 2025-12-04T09:18:59.7824775Z "digest": "sha256:2c666d30ed77fff9ff1167d41cd645dad98280fcbe941f5bc3828c7ae66b1287" 2025-12-04T09:18:59.7825236Z }, 2025-12-04T09:18:59.7825427Z { 2025-12-04T09:18:59.7825754Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7826146Z "size": 880, 2025-12-04T09:18:59.7826549Z "digest": "sha256:5d8d3a0a98e012c5068e0f3bae5a03e3148ecf2d063634eee4c9241a1e3fdfb5" 2025-12-04T09:18:59.7827006Z }, 2025-12-04T09:18:59.7827194Z { 2025-12-04T09:18:59.7827516Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7827929Z "size": 724, 2025-12-04T09:18:59.7828331Z "digest": "sha256:3541df015cdb7e8925273399d28e56c31b3c9196f00439ac2925537b173b1f84" 2025-12-04T09:18:59.7828770Z }, 2025-12-04T09:18:59.7828972Z { 2025-12-04T09:18:59.7829293Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7829706Z "size": 139, 2025-12-04T09:18:59.7830112Z "digest": "sha256:b06bafce9e817295d8127207747c80aa18e04392ff0875844fc30a1e794a8a0c" 2025-12-04T09:18:59.7830562Z }, 2025-12-04T09:18:59.7830744Z { 2025-12-04T09:18:59.7831081Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7831495Z "size": 32, 2025-12-04T09:18:59.7831891Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7832351Z }, 2025-12-04T09:18:59.7832551Z { 2025-12-04T09:18:59.7832875Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7833276Z "size": 159, 2025-12-04T09:18:59.7833677Z "digest": "sha256:15e0d7e4590d3d8f598d05aec3a92f891bf8b4605bcc38cc2de852b6014ef8f3" 2025-12-04T09:18:59.7834130Z }, 2025-12-04T09:18:59.7834321Z { 2025-12-04T09:18:59.7834654Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7835055Z "size": 1011, 2025-12-04T09:18:59.7835457Z "digest": "sha256:a514bd1add3164d8d7ca99aa19294c4ed8b97b074635d98714c4f598a959f4cd" 2025-12-04T09:18:59.7835912Z }, 2025-12-04T09:18:59.7836105Z { 2025-12-04T09:18:59.7836423Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7836820Z "size": 724, 2025-12-04T09:18:59.7837209Z "digest": "sha256:3541df015cdb7e8925273399d28e56c31b3c9196f00439ac2925537b173b1f84" 2025-12-04T09:18:59.7837647Z }, 2025-12-04T09:18:59.7837834Z { 2025-12-04T09:18:59.7838157Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7838548Z "size": 134, 2025-12-04T09:18:59.7838945Z "digest": "sha256:57b84ee6000204f27a1d9bca199b19be4c86ecd324540dbdf239c56a6c3b34ea" 2025-12-04T09:18:59.7839553Z }, 2025-12-04T09:18:59.7839731Z { 2025-12-04T09:18:59.7840063Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7840483Z "size": 32, 2025-12-04T09:18:59.7840892Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7841338Z }, 2025-12-04T09:18:59.7841532Z { 2025-12-04T09:18:59.7841859Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7842278Z "size": 157, 2025-12-04T09:18:59.7842711Z "digest": "sha256:b8babeff6d817a5961dddc15c6bdfdbd05da187fae75d5804015f99fd7c066d8" 2025-12-04T09:18:59.7843190Z }, 2025-12-04T09:18:59.7843384Z { 2025-12-04T09:18:59.7843722Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7844148Z "size": 602, 2025-12-04T09:18:59.7844560Z "digest": "sha256:83779ddf6a85ab387f64a45f274cba245b69e4fd1931ff0b5d7d3efd4b7a43bc" 2025-12-04T09:18:59.7845030Z }, 2025-12-04T09:18:59.7845240Z { 2025-12-04T09:18:59.7845703Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7857752Z "size": 724, 2025-12-04T09:18:59.7858191Z "digest": "sha256:3541df015cdb7e8925273399d28e56c31b3c9196f00439ac2925537b173b1f84" 2025-12-04T09:18:59.7858635Z }, 2025-12-04T09:18:59.7858825Z { 2025-12-04T09:18:59.7859155Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7859567Z "size": 155, 2025-12-04T09:18:59.7859969Z "digest": "sha256:8b7620c0d736cc79381207ce5afe2af90f0cd7f0cd394577d2c9520d7f74762f" 2025-12-04T09:18:59.7860415Z }, 2025-12-04T09:18:59.7860604Z { 2025-12-04T09:18:59.7860934Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7861336Z "size": 32, 2025-12-04T09:18:59.7861737Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7862190Z }, 2025-12-04T09:18:59.7862369Z { 2025-12-04T09:18:59.7862767Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7863170Z "size": 188, 2025-12-04T09:18:59.7863568Z "digest": "sha256:3bcfa090e4efd3677425f76baea9f1e0c50a75d8c6b5713ec05310f1dff24539" 2025-12-04T09:18:59.7864018Z }, 2025-12-04T09:18:59.7864198Z { 2025-12-04T09:18:59.7864512Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7864910Z "size": 1370, 2025-12-04T09:18:59.7865317Z "digest": "sha256:eb0504ec4d9218a79896b604f73dc0ea5a0f96266ad9c2cdbbbe5f0f18222694" 2025-12-04T09:18:59.7865768Z }, 2025-12-04T09:18:59.7865950Z { 2025-12-04T09:18:59.7866273Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7866671Z "size": 32, 2025-12-04T09:18:59.7867061Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7867515Z }, 2025-12-04T09:18:59.7867704Z { 2025-12-04T09:18:59.7868018Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7868423Z "size": 136, 2025-12-04T09:18:59.7868825Z "digest": "sha256:15d0fec09d7b196a1462d51516ee90fc3443ba178d3e56d59cacf32146b4321d" 2025-12-04T09:18:59.7869268Z }, 2025-12-04T09:18:59.7869458Z { 2025-12-04T09:18:59.7869780Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7870174Z "size": 528, 2025-12-04T09:18:59.7870576Z "digest": "sha256:cca81fcc62a949959ca4dd3c9056fb293d548ef8607127eeeef6cfd3a8897ca8" 2025-12-04T09:18:59.7871030Z }, 2025-12-04T09:18:59.7871225Z { 2025-12-04T09:18:59.7871538Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7871942Z "size": 32, 2025-12-04T09:18:59.7872351Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7872792Z }, 2025-12-04T09:18:59.7872985Z { 2025-12-04T09:18:59.7873311Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7873828Z "size": 104, 2025-12-04T09:18:59.7874248Z "digest": "sha256:b0b8f9b5c6ab98db9cd830dc584e1b6aec9add139e4cc48d8c243d36691e25b4" 2025-12-04T09:18:59.7874713Z }, 2025-12-04T09:18:59.7874894Z { 2025-12-04T09:18:59.7875220Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7875622Z "size": 435, 2025-12-04T09:18:59.7876031Z "digest": "sha256:0606ca4d47a8a70e91e92b03ca51a85e731641b09342136a54ef2f2a6d9dfb44" 2025-12-04T09:18:59.7876469Z }, 2025-12-04T09:18:59.7876663Z { 2025-12-04T09:18:59.7876992Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7877389Z "size": 32, 2025-12-04T09:18:59.7877793Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7878248Z }, 2025-12-04T09:18:59.7878429Z { 2025-12-04T09:18:59.7878748Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7879147Z "size": 109, 2025-12-04T09:18:59.7879643Z "digest": "sha256:2f80a4e1b3b95ed67bb781ea787e8a63e46de79117d9d8e65c257072b38afa2d" 2025-12-04T09:18:59.7880097Z }, 2025-12-04T09:18:59.7880285Z { 2025-12-04T09:18:59.7880602Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7881001Z "size": 1896, 2025-12-04T09:18:59.7881398Z "digest": "sha256:35c916fb1bd057e517dcab78c3a2a018e68096d8993892ad84f47562d37ae352" 2025-12-04T09:18:59.7881892Z }, 2025-12-04T09:18:59.7882069Z { 2025-12-04T09:18:59.7882391Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7882796Z "size": 197526165, 2025-12-04T09:18:59.7883193Z "digest": "sha256:195537b7dafc96192f768323b1a8cc2a914d41959849b73198579576b0872a44" 2025-12-04T09:18:59.7883629Z }, 2025-12-04T09:18:59.7883815Z { 2025-12-04T09:18:59.7884136Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7884539Z "size": 106, 2025-12-04T09:18:59.7884929Z "digest": "sha256:dc454fd3967e5735b2498b7f1d958a2c626987d5e4ce225ca98da3cd945b59f3" 2025-12-04T09:18:59.7885375Z }, 2025-12-04T09:18:59.7885561Z { 2025-12-04T09:18:59.7885891Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7886283Z "size": 165, 2025-12-04T09:18:59.7886690Z "digest": "sha256:701b34f115fa897181c046dc37288e87cbc3ad74c36a9e2224b5bfe7c5703afb" 2025-12-04T09:18:59.7887137Z }, 2025-12-04T09:18:59.7887323Z { 2025-12-04T09:18:59.7887638Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7888033Z "size": 7944, 2025-12-04T09:18:59.7888432Z "digest": "sha256:39cefc00ffedebc9098261c798408b87a20c95a88fccb110594077f48dadf760" 2025-12-04T09:18:59.7888869Z }, 2025-12-04T09:18:59.7889057Z { 2025-12-04T09:18:59.7889384Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7889780Z "size": 8071, 2025-12-04T09:18:59.7890182Z "digest": "sha256:6ae51eb61a325b2c2995a5088c81aa20821b75be65b5aa722c7c40556b5d03ea" 2025-12-04T09:18:59.7890636Z }, 2025-12-04T09:18:59.7890827Z { 2025-12-04T09:18:59.7891149Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7891545Z "size": 304, 2025-12-04T09:18:59.7891949Z "digest": "sha256:1fd5341e66dfc0c1ae23af014641a92a6fd02640c528fe6d4dc55921ed659a26" 2025-12-04T09:18:59.7892393Z }, 2025-12-04T09:18:59.7892586Z { 2025-12-04T09:18:59.7892916Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7893307Z "size": 13364291, 2025-12-04T09:18:59.7893718Z "digest": "sha256:72a7c87e35e40ab796f90aee1b51add7902f0cdc44406d2505b6c6a1f55a8da6" 2025-12-04T09:18:59.7894164Z }, 2025-12-04T09:18:59.7894351Z { 2025-12-04T09:18:59.7894671Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7895071Z "size": 108, 2025-12-04T09:18:59.7895472Z "digest": "sha256:ec36862ac98ebaac52ee1a8b1d162d45bd0e3bf59ae7e19c8f80ad3960b4c600" 2025-12-04T09:18:59.7896011Z }, 2025-12-04T09:18:59.7896197Z { 2025-12-04T09:18:59.7896521Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7896918Z "size": 54145699, 2025-12-04T09:18:59.7897326Z "digest": "sha256:05ddbf246e8add0e293474dbf88bb028d5a295a25ac59e8648a18db644377773" 2025-12-04T09:18:59.7897773Z }, 2025-12-04T09:18:59.7897952Z { 2025-12-04T09:18:59.7898276Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T09:18:59.7898676Z "size": 32, 2025-12-04T09:18:59.7899064Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T09:18:59.7899519Z } 2025-12-04T09:18:59.7899713Z ] 2025-12-04T09:18:59.7899895Z } 2025-12-04T09:18:59.7900100Z + exit 0 2025-12-04T09:18:59.7927742Z ##[group]Run set -eux 2025-12-04T09:18:59.7928411Z set -eux 2025-12-04T09:18:59.7928829Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-12-04T09:18:59.7930063Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-12-04T09:18:59.7939723Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:59.7940086Z env: 2025-12-04T09:18:59.7940304Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:59.7940569Z ##[endgroup] 2025-12-04T09:18:59.7974158Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-12-04T09:18:59.7974852Z + jq --raw-output .SecretString 2025-12-04T09:18:59.7976719Z + jq -r .docker_hub_readonly_token 2025-12-04T09:18:59.7978426Z + docker login --username pytorchbot --password-stdin 2025-12-04T09:19:00.3652702Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-12-04T09:19:00.3653282Z Configure a credential helper to remove this warning. See 2025-12-04T09:19:00.3653843Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-12-04T09:19:00.3654243Z 2025-12-04T09:19:00.3658551Z Login Succeeded 2025-12-04T09:19:00.3778032Z ##[group]Run tag=${ECR_DOCKER_IMAGE##*:} 2025-12-04T09:19:00.3778408Z tag=${ECR_DOCKER_IMAGE##*:} 2025-12-04T09:19:00.3778816Z echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}" 2025-12-04T09:19:00.3788159Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:19:00.3788524Z env: 2025-12-04T09:19:00.3788746Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:19:00.3789536Z ECR_DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:19:00.3790351Z ##[endgroup] 2025-12-04T09:19:00.3822839Z docker pull ghcr.io/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:19:00.3882024Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-12-04T09:19:00.3882491Z with: 2025-12-04T09:19:00.3883222Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:19:00.3884133Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:19:00.3884503Z env: 2025-12-04T09:19:00.3884725Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:19:00.3884980Z ##[endgroup] 2025-12-04T09:19:00.3900112Z ##[group]Run set -x 2025-12-04T09:19:00.3900384Z set -x 2025-12-04T09:19:00.3900606Z set +e 2025-12-04T09:19:00.3900829Z  2025-12-04T09:19:00.3901039Z login() { 2025-12-04T09:19:00.3901502Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T09:19:00.3902005Z } 2025-12-04T09:19:00.3902211Z  2025-12-04T09:19:00.3902548Z retry () { 2025-12-04T09:19:00.3902823Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T09:19:00.3903320Z } 2025-12-04T09:19:00.3903528Z  2025-12-04T09:19:00.3903766Z retry login "${DOCKER_REGISTRY}" 2025-12-04T09:19:00.3904066Z  2025-12-04T09:19:00.3904542Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-12-04T09:19:00.3905179Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-12-04T09:19:00.3905543Z  2025-12-04T09:19:00.3905754Z set -e 2025-12-04T09:19:00.3906096Z # ignore output since only exit code is used for conditional 2025-12-04T09:19:00.3906579Z # only pull docker image if it's not available locally 2025-12-04T09:19:00.3907115Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-12-04T09:19:00.3907615Z  retry docker pull "${DOCKER_IMAGE}" 2025-12-04T09:19:00.3907935Z fi 2025-12-04T09:19:00.3916897Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:19:00.3917273Z env: 2025-12-04T09:19:00.3917494Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:19:00.3918276Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:19:00.3919178Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:19:00.3919563Z ##[endgroup] 2025-12-04T09:19:00.3950466Z + set +e 2025-12-04T09:19:00.3950935Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:19:00.3951407Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:19:00.3955103Z + aws ecr get-login-password --region us-east-1 2025-12-04T09:19:00.3955626Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T09:19:00.9424764Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-12-04T09:19:00.9425366Z Configure a credential helper to remove this warning. See 2025-12-04T09:19:00.9425946Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-12-04T09:19:00.9426323Z 2025-12-04T09:19:00.9426825Z Login Succeeded 2025-12-04T09:19:00.9460574Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:19:00.9461525Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-12-04T09:19:01.1671621Z + IMAGE_SIZE=15091.581844329834 2025-12-04T09:19:01.1671994Z Compressed size of image in MB: 15091.581844329834 2025-12-04T09:19:01.1672424Z + echo 'Compressed size of image in MB: 15091.581844329834' 2025-12-04T09:19:01.1672781Z + set -e 2025-12-04T09:19:01.1673943Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:19:01.1807762Z + retry docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:19:01.1810449Z + docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:19:01.4343059Z pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a: Pulling from pytorch/ci-image 2025-12-04T09:19:01.4344137Z 63e5bc7682b8: Pulling fs layer 2025-12-04T09:19:01.4344545Z 0678d56345c9: Pulling fs layer 2025-12-04T09:19:01.4344958Z 45f5c9ddfce7: Pulling fs layer 2025-12-04T09:19:01.4345356Z 086b1df51ac1: Pulling fs layer 2025-12-04T09:19:01.4345762Z fe8a7b64bf98: Pulling fs layer 2025-12-04T09:19:01.4346176Z 7680723e9a57: Pulling fs layer 2025-12-04T09:19:01.4346594Z 9c5027aeeb4e: Pulling fs layer 2025-12-04T09:19:01.4346989Z 9a5652110360: Pulling fs layer 2025-12-04T09:19:01.4347389Z 375c4427e914: Pulling fs layer 2025-12-04T09:19:01.4348132Z a86faaa7dbdd: Pulling fs layer 2025-12-04T09:19:01.4348538Z fb7848686804: Pulling fs layer 2025-12-04T09:19:01.4348944Z 3541df015cdb: Pulling fs layer 2025-12-04T09:19:01.4349341Z 79dc80f426b2: Pulling fs layer 2025-12-04T09:19:01.4349742Z a13fcc1b90bb: Pulling fs layer 2025-12-04T09:19:01.4350143Z 4f4fb700ef54: Pulling fs layer 2025-12-04T09:19:01.4350549Z 9c5027aeeb4e: Waiting 2025-12-04T09:19:01.4350898Z 7680723e9a57: Waiting 2025-12-04T09:19:01.4351251Z 549db4d6c618: Pulling fs layer 2025-12-04T09:19:01.4351647Z 5c63528cb580: Pulling fs layer 2025-12-04T09:19:01.4352043Z 75bd83b989a4: Pulling fs layer 2025-12-04T09:19:01.4352452Z de6e78970f51: Pulling fs layer 2025-12-04T09:19:01.4352849Z e13ed7c7e473: Pulling fs layer 2025-12-04T09:19:01.4353249Z 6e2949bcb741: Pulling fs layer 2025-12-04T09:19:01.4353626Z 14d69d9aaec7: Pulling fs layer 2025-12-04T09:19:01.4353969Z fe8a7b64bf98: Waiting 2025-12-04T09:19:01.4354210Z a86faaa7dbdd: Waiting 2025-12-04T09:19:01.4354477Z 5c02769dd8e5: Pulling fs layer 2025-12-04T09:19:01.4354811Z fb7848686804: Waiting 2025-12-04T09:19:01.4355047Z 79dc80f426b2: Waiting 2025-12-04T09:19:01.4355294Z 35041ce524ac: Pulling fs layer 2025-12-04T09:19:01.4355631Z a13fcc1b90bb: Waiting 2025-12-04T09:19:01.4355891Z 2fa92dc5885e: Pulling fs layer 2025-12-04T09:19:01.4356183Z 2b85eafbd92a: Pulling fs layer 2025-12-04T09:19:01.4356517Z 375c4427e914: Waiting 2025-12-04T09:19:01.4356751Z 549db4d6c618: Waiting 2025-12-04T09:19:01.4356995Z 3541df015cdb: Waiting 2025-12-04T09:19:01.4357312Z ff755a4ddad7: Pulling fs layer 2025-12-04T09:19:01.4357605Z 09eb41bdf42d: Pulling fs layer 2025-12-04T09:19:01.4357893Z 11ede4d59e93: Pulling fs layer 2025-12-04T09:19:01.4358231Z 6e2949bcb741: Waiting 2025-12-04T09:19:01.4358468Z 5c63528cb580: Waiting 2025-12-04T09:19:01.4358716Z 1283cd8f801a: Pulling fs layer 2025-12-04T09:19:01.4359059Z 14d69d9aaec7: Waiting 2025-12-04T09:19:01.4359297Z 9a5652110360: Waiting 2025-12-04T09:19:01.4359547Z 024fa855425f: Pulling fs layer 2025-12-04T09:19:01.4359882Z 086b1df51ac1: Waiting 2025-12-04T09:19:01.4360127Z 5c02769dd8e5: Waiting 2025-12-04T09:19:01.4360365Z e13ed7c7e473: Waiting 2025-12-04T09:19:01.4360617Z 11ede4d59e93: Waiting 2025-12-04T09:19:01.4360849Z ff755a4ddad7: Waiting 2025-12-04T09:19:01.4361100Z 303e6747a62e: Pulling fs layer 2025-12-04T09:19:01.4361382Z 3017cdf4838b: Pulling fs layer 2025-12-04T09:19:01.4361676Z 4f4fb700ef54: Waiting 2025-12-04T09:19:01.4361928Z 75bd83b989a4: Waiting 2025-12-04T09:19:01.4362186Z 2b85eafbd92a: Waiting 2025-12-04T09:19:01.4362418Z 1283cd8f801a: Waiting 2025-12-04T09:19:01.4362643Z 024fa855425f: Waiting 2025-12-04T09:19:01.4362890Z 6b6cd1c358e8: Pulling fs layer 2025-12-04T09:19:01.4363172Z b2dd04501124: Pulling fs layer 2025-12-04T09:19:01.4363443Z 55adc51fe589: Pulling fs layer 2025-12-04T09:19:01.4363724Z a43ca0e4b837: Pulling fs layer 2025-12-04T09:19:01.4364287Z b7212f17fd14: Pulling fs layer 2025-12-04T09:19:01.4364660Z 083e42cac090: Pulling fs layer 2025-12-04T09:19:01.4364933Z 6b6cd1c358e8: Waiting 2025-12-04T09:19:01.4365188Z 0a00b784a4aa: Pulling fs layer 2025-12-04T09:19:01.4365446Z b2dd04501124: Waiting 2025-12-04T09:19:01.4365672Z 55adc51fe589: Waiting 2025-12-04T09:19:01.4365919Z c6173c779f7b: Pulling fs layer 2025-12-04T09:19:01.4366176Z b7212f17fd14: Waiting 2025-12-04T09:19:01.4366399Z 083e42cac090: Waiting 2025-12-04T09:19:01.4366628Z 3017cdf4838b: Waiting 2025-12-04T09:19:01.4366870Z ed3d1e3387b9: Pulling fs layer 2025-12-04T09:19:01.4367126Z c6173c779f7b: Waiting 2025-12-04T09:19:01.4367366Z b29343478586: Pulling fs layer 2025-12-04T09:19:01.4367621Z ed3d1e3387b9: Waiting 2025-12-04T09:19:01.4367861Z c6f0520487fb: Pulling fs layer 2025-12-04T09:19:01.4368139Z 148171691cd4: Pulling fs layer 2025-12-04T09:19:01.4368398Z 0a00b784a4aa: Waiting 2025-12-04T09:19:01.4368622Z 35041ce524ac: Waiting 2025-12-04T09:19:01.4368849Z b29343478586: Waiting 2025-12-04T09:19:01.4369097Z 2c666d30ed77: Pulling fs layer 2025-12-04T09:19:01.4369353Z c6f0520487fb: Waiting 2025-12-04T09:19:01.4369677Z 148171691cd4: Waiting 2025-12-04T09:19:01.4369921Z 5d8d3a0a98e0: Pulling fs layer 2025-12-04T09:19:01.4370199Z b06bafce9e81: Pulling fs layer 2025-12-04T09:19:01.4370461Z 5d8d3a0a98e0: Waiting 2025-12-04T09:19:01.4370692Z 2fa92dc5885e: Waiting 2025-12-04T09:19:01.4370935Z 15e0d7e4590d: Pulling fs layer 2025-12-04T09:19:01.4371191Z b06bafce9e81: Waiting 2025-12-04T09:19:01.4371440Z a514bd1add31: Pulling fs layer 2025-12-04T09:19:01.4371702Z a43ca0e4b837: Waiting 2025-12-04T09:19:01.4371959Z 2c666d30ed77: Waiting 2025-12-04T09:19:01.4372238Z 15e0d7e4590d: Waiting 2025-12-04T09:19:01.4372547Z a514bd1add31: Waiting 2025-12-04T09:19:01.4372860Z 57b84ee60002: Pulling fs layer 2025-12-04T09:19:01.4373170Z 303e6747a62e: Waiting 2025-12-04T09:19:01.4373403Z 57b84ee60002: Waiting 2025-12-04T09:19:01.4373632Z de6e78970f51: Waiting 2025-12-04T09:19:01.4373890Z b8babeff6d81: Pulling fs layer 2025-12-04T09:19:01.4374181Z 83779ddf6a85: Pulling fs layer 2025-12-04T09:19:01.4374438Z 83779ddf6a85: Waiting 2025-12-04T09:19:01.4374691Z 8b7620c0d736: Pulling fs layer 2025-12-04T09:19:01.4374952Z 8b7620c0d736: Waiting 2025-12-04T09:19:01.4375182Z 09eb41bdf42d: Waiting 2025-12-04T09:19:01.4375435Z 3bcfa090e4ef: Pulling fs layer 2025-12-04T09:19:01.4375708Z b8babeff6d81: Waiting 2025-12-04T09:19:01.4375955Z eb0504ec4d92: Pulling fs layer 2025-12-04T09:19:01.4376244Z 15d0fec09d7b: Pulling fs layer 2025-12-04T09:19:01.4376527Z cca81fcc62a9: Pulling fs layer 2025-12-04T09:19:01.4376804Z 3bcfa090e4ef: Waiting 2025-12-04T09:19:01.4377039Z 15d0fec09d7b: Waiting 2025-12-04T09:19:01.4377274Z eb0504ec4d92: Waiting 2025-12-04T09:19:01.4377524Z b0b8f9b5c6ab: Pulling fs layer 2025-12-04T09:19:01.4377801Z 0606ca4d47a8: Pulling fs layer 2025-12-04T09:19:01.4378066Z cca81fcc62a9: Waiting 2025-12-04T09:19:01.4378317Z 2f80a4e1b3b9: Pulling fs layer 2025-12-04T09:19:01.4378590Z 35c916fb1bd0: Pulling fs layer 2025-12-04T09:19:01.4378859Z 0606ca4d47a8: Waiting 2025-12-04T09:19:01.4379094Z b0b8f9b5c6ab: Waiting 2025-12-04T09:19:01.4379338Z 195537b7dafc: Pulling fs layer 2025-12-04T09:19:01.4379679Z 35c916fb1bd0: Waiting 2025-12-04T09:19:01.4379938Z dc454fd3967e: Pulling fs layer 2025-12-04T09:19:01.4380246Z 195537b7dafc: Waiting 2025-12-04T09:19:01.4380479Z 2f80a4e1b3b9: Waiting 2025-12-04T09:19:01.4380723Z 701b34f115fa: Pulling fs layer 2025-12-04T09:19:01.4380996Z 39cefc00ffed: Pulling fs layer 2025-12-04T09:19:01.4381273Z 6ae51eb61a32: Pulling fs layer 2025-12-04T09:19:01.4381557Z 1fd5341e66df: Pulling fs layer 2025-12-04T09:19:01.4381840Z 72a7c87e35e4: Pulling fs layer 2025-12-04T09:19:01.4382164Z 701b34f115fa: Waiting 2025-12-04T09:19:01.4382543Z ec36862ac98e: Pulling fs layer 2025-12-04T09:19:01.4382918Z dc454fd3967e: Waiting 2025-12-04T09:19:01.4383227Z 1fd5341e66df: Waiting 2025-12-04T09:19:01.4383486Z 05ddbf246e8a: Pulling fs layer 2025-12-04T09:19:01.4383759Z 39cefc00ffed: Waiting 2025-12-04T09:19:01.4383986Z 72a7c87e35e4: Waiting 2025-12-04T09:19:01.4384342Z ec36862ac98e: Waiting 2025-12-04T09:19:01.4384590Z 6ae51eb61a32: Waiting 2025-12-04T09:19:01.4384823Z 05ddbf246e8a: Waiting 2025-12-04T09:19:01.5219249Z 0678d56345c9: Verifying Checksum 2025-12-04T09:19:01.5219563Z 0678d56345c9: Download complete 2025-12-04T09:19:01.6143396Z 086b1df51ac1: Verifying Checksum 2025-12-04T09:19:01.6143700Z 086b1df51ac1: Download complete 2025-12-04T09:19:01.6870055Z fe8a7b64bf98: Verifying Checksum 2025-12-04T09:19:01.6870371Z fe8a7b64bf98: Download complete 2025-12-04T09:19:01.7734839Z 7680723e9a57: Verifying Checksum 2025-12-04T09:19:01.7735358Z 7680723e9a57: Download complete 2025-12-04T09:19:01.7966019Z 63e5bc7682b8: Verifying Checksum 2025-12-04T09:19:01.7966320Z 63e5bc7682b8: Download complete 2025-12-04T09:19:01.8707062Z 9c5027aeeb4e: Download complete 2025-12-04T09:19:01.8872320Z 9a5652110360: Verifying Checksum 2025-12-04T09:19:01.8872742Z 9a5652110360: Download complete 2025-12-04T09:19:01.9860123Z a86faaa7dbdd: Download complete 2025-12-04T09:19:02.0936897Z fb7848686804: Verifying Checksum 2025-12-04T09:19:02.0937243Z fb7848686804: Download complete 2025-12-04T09:19:02.1817350Z 3541df015cdb: Verifying Checksum 2025-12-04T09:19:02.1817740Z 3541df015cdb: Download complete 2025-12-04T09:19:02.2764035Z 79dc80f426b2: Download complete 2025-12-04T09:19:02.9868880Z 63e5bc7682b8: Pull complete 2025-12-04T09:19:03.0101726Z 0678d56345c9: Pull complete 2025-12-04T09:19:03.0360594Z 375c4427e914: Verifying Checksum 2025-12-04T09:19:03.0361068Z 375c4427e914: Download complete 2025-12-04T09:19:03.0443735Z 4f4fb700ef54: Verifying Checksum 2025-12-04T09:19:03.0444246Z 4f4fb700ef54: Download complete 2025-12-04T09:19:03.1223829Z 549db4d6c618: Verifying Checksum 2025-12-04T09:19:03.1224270Z 549db4d6c618: Download complete 2025-12-04T09:19:03.2083760Z 5c63528cb580: Download complete 2025-12-04T09:19:03.2922913Z 75bd83b989a4: Verifying Checksum 2025-12-04T09:19:03.2923395Z 75bd83b989a4: Download complete 2025-12-04T09:19:03.3910448Z de6e78970f51: Download complete 2025-12-04T09:19:03.4710494Z e13ed7c7e473: Verifying Checksum 2025-12-04T09:19:03.4710846Z e13ed7c7e473: Download complete 2025-12-04T09:19:03.5387774Z 6e2949bcb741: Verifying Checksum 2025-12-04T09:19:03.5388188Z 6e2949bcb741: Download complete 2025-12-04T09:19:03.6248763Z 14d69d9aaec7: Verifying Checksum 2025-12-04T09:19:03.6249110Z 14d69d9aaec7: Download complete 2025-12-04T09:19:03.6943108Z 5c02769dd8e5: Verifying Checksum 2025-12-04T09:19:03.6943519Z 5c02769dd8e5: Download complete 2025-12-04T09:19:04.6114690Z 45f5c9ddfce7: Verifying Checksum 2025-12-04T09:19:04.6115100Z 45f5c9ddfce7: Download complete 2025-12-04T09:19:04.6882554Z 2fa92dc5885e: Verifying Checksum 2025-12-04T09:19:04.6883050Z 2fa92dc5885e: Download complete 2025-12-04T09:19:05.0742216Z 2b85eafbd92a: Verifying Checksum 2025-12-04T09:19:05.0742617Z 2b85eafbd92a: Download complete 2025-12-04T09:19:05.1449599Z ff755a4ddad7: Verifying Checksum 2025-12-04T09:19:05.1449942Z ff755a4ddad7: Download complete 2025-12-04T09:19:05.2318979Z 09eb41bdf42d: Download complete 2025-12-04T09:19:09.8780921Z 11ede4d59e93: Verifying Checksum 2025-12-04T09:19:09.8781305Z 11ede4d59e93: Download complete 2025-12-04T09:19:09.9878449Z 1283cd8f801a: Verifying Checksum 2025-12-04T09:19:10.0864552Z 1283cd8f801a: Download complete 2025-12-04T09:19:10.0864934Z 024fa855425f: Verifying Checksum 2025-12-04T09:19:10.0865352Z 024fa855425f: Download complete 2025-12-04T09:19:10.1589373Z 303e6747a62e: Verifying Checksum 2025-12-04T09:19:10.1589848Z 303e6747a62e: Download complete 2025-12-04T09:19:10.2407635Z 3017cdf4838b: Verifying Checksum 2025-12-04T09:19:10.2407975Z 3017cdf4838b: Download complete 2025-12-04T09:19:10.4972101Z 6b6cd1c358e8: Verifying Checksum 2025-12-04T09:19:10.4972572Z 6b6cd1c358e8: Download complete 2025-12-04T09:19:10.5996467Z b2dd04501124: Download complete 2025-12-04T09:19:10.6782407Z 55adc51fe589: Verifying Checksum 2025-12-04T09:19:10.6782871Z 55adc51fe589: Download complete 2025-12-04T09:19:10.7473559Z a43ca0e4b837: Verifying Checksum 2025-12-04T09:19:10.7474177Z a43ca0e4b837: Download complete 2025-12-04T09:19:10.8438746Z b7212f17fd14: Download complete 2025-12-04T09:19:10.9251343Z 083e42cac090: Verifying Checksum 2025-12-04T09:19:10.9251695Z 083e42cac090: Download complete 2025-12-04T09:19:11.0102311Z 0a00b784a4aa: Verifying Checksum 2025-12-04T09:19:11.0102926Z 0a00b784a4aa: Download complete 2025-12-04T09:19:11.0788071Z c6173c779f7b: Verifying Checksum 2025-12-04T09:19:11.0788629Z c6173c779f7b: Download complete 2025-12-04T09:19:12.5909557Z ed3d1e3387b9: Verifying Checksum 2025-12-04T09:19:12.5910072Z ed3d1e3387b9: Download complete 2025-12-04T09:19:12.6888820Z b29343478586: Verifying Checksum 2025-12-04T09:19:12.6889184Z b29343478586: Download complete 2025-12-04T09:19:14.8260316Z 45f5c9ddfce7: Pull complete 2025-12-04T09:19:15.0056995Z 086b1df51ac1: Pull complete 2025-12-04T09:19:15.1096439Z fe8a7b64bf98: Pull complete 2025-12-04T09:19:15.2948295Z 7680723e9a57: Pull complete 2025-12-04T09:19:15.4486519Z 9c5027aeeb4e: Pull complete 2025-12-04T09:19:15.6083798Z 9a5652110360: Pull complete 2025-12-04T09:19:15.8608469Z c6f0520487fb: Verifying Checksum 2025-12-04T09:19:15.8609217Z c6f0520487fb: Download complete 2025-12-04T09:19:18.3043125Z 375c4427e914: Pull complete 2025-12-04T09:19:18.5111862Z a86faaa7dbdd: Pull complete 2025-12-04T09:19:18.7297612Z fb7848686804: Pull complete 2025-12-04T09:19:18.9514426Z 3541df015cdb: Pull complete 2025-12-04T09:19:19.1684225Z 79dc80f426b2: Pull complete 2025-12-04T09:19:34.1791189Z a13fcc1b90bb: Verifying Checksum 2025-12-04T09:19:34.1791554Z a13fcc1b90bb: Download complete 2025-12-04T09:19:34.3454174Z 2c666d30ed77: Verifying Checksum 2025-12-04T09:19:34.3454590Z 2c666d30ed77: Download complete 2025-12-04T09:19:34.4356303Z 5d8d3a0a98e0: Verifying Checksum 2025-12-04T09:19:34.4356662Z 5d8d3a0a98e0: Download complete 2025-12-04T09:19:34.5917323Z b06bafce9e81: Verifying Checksum 2025-12-04T09:19:34.5917827Z b06bafce9e81: Download complete 2025-12-04T09:19:34.7152933Z 15e0d7e4590d: Download complete 2025-12-04T09:19:34.8092874Z a514bd1add31: Verifying Checksum 2025-12-04T09:19:34.8093377Z a514bd1add31: Download complete 2025-12-04T09:19:34.9129248Z 57b84ee60002: Download complete 2025-12-04T09:19:35.0217885Z b8babeff6d81: Download complete 2025-12-04T09:19:35.1067762Z 83779ddf6a85: Verifying Checksum 2025-12-04T09:19:35.1068241Z 83779ddf6a85: Download complete 2025-12-04T09:19:35.1803007Z 8b7620c0d736: Verifying Checksum 2025-12-04T09:19:35.1803615Z 8b7620c0d736: Download complete 2025-12-04T09:19:35.2613074Z 3bcfa090e4ef: Verifying Checksum 2025-12-04T09:19:35.2613434Z 3bcfa090e4ef: Download complete 2025-12-04T09:19:35.3427298Z eb0504ec4d92: Verifying Checksum 2025-12-04T09:19:35.3430091Z eb0504ec4d92: Download complete 2025-12-04T09:19:35.4450191Z 15d0fec09d7b: Verifying Checksum 2025-12-04T09:19:35.4450683Z 15d0fec09d7b: Download complete 2025-12-04T09:19:35.5435734Z cca81fcc62a9: Verifying Checksum 2025-12-04T09:19:35.5436217Z cca81fcc62a9: Download complete 2025-12-04T09:19:35.6290160Z b0b8f9b5c6ab: Download complete 2025-12-04T09:19:35.7083319Z 0606ca4d47a8: Download complete 2025-12-04T09:19:35.8014679Z 2f80a4e1b3b9: Download complete 2025-12-04T09:19:35.8919853Z 35c916fb1bd0: Verifying Checksum 2025-12-04T09:19:35.8920365Z 35c916fb1bd0: Download complete 2025-12-04T09:19:37.9296161Z 195537b7dafc: Verifying Checksum 2025-12-04T09:19:37.9296637Z 195537b7dafc: Download complete 2025-12-04T09:19:38.0062003Z dc454fd3967e: Download complete 2025-12-04T09:19:38.0865157Z 701b34f115fa: Download complete 2025-12-04T09:19:38.1682919Z 39cefc00ffed: Verifying Checksum 2025-12-04T09:19:38.1683371Z 39cefc00ffed: Download complete 2025-12-04T09:19:38.2558894Z 6ae51eb61a32: Verifying Checksum 2025-12-04T09:19:38.2559361Z 6ae51eb61a32: Download complete 2025-12-04T09:19:38.3330129Z 1fd5341e66df: Verifying Checksum 2025-12-04T09:19:38.3330608Z 1fd5341e66df: Download complete 2025-12-04T09:19:38.5257764Z 72a7c87e35e4: Verifying Checksum 2025-12-04T09:19:38.5258254Z 72a7c87e35e4: Download complete 2025-12-04T09:19:38.6189674Z ec36862ac98e: Verifying Checksum 2025-12-04T09:19:38.6190186Z ec36862ac98e: Download complete 2025-12-04T09:19:39.2319850Z 05ddbf246e8a: Verifying Checksum 2025-12-04T09:19:39.2320387Z 05ddbf246e8a: Download complete 2025-12-04T09:19:46.4925568Z 148171691cd4: Verifying Checksum 2025-12-04T09:19:46.4926030Z 148171691cd4: Download complete 2025-12-04T09:20:22.6560971Z 35041ce524ac: Verifying Checksum 2025-12-04T09:20:22.6561371Z 35041ce524ac: Download complete 2025-12-04T09:20:55.4419527Z a13fcc1b90bb: Pull complete 2025-12-04T09:20:55.5732189Z 4f4fb700ef54: Pull complete 2025-12-04T09:20:55.7089830Z 549db4d6c618: Pull complete 2025-12-04T09:20:55.9637744Z 5c63528cb580: Pull complete 2025-12-04T09:20:56.1795066Z 75bd83b989a4: Pull complete 2025-12-04T09:20:56.4599139Z de6e78970f51: Pull complete 2025-12-04T09:20:56.5758625Z e13ed7c7e473: Pull complete 2025-12-04T09:20:56.7309311Z 6e2949bcb741: Pull complete 2025-12-04T09:20:56.7942613Z 14d69d9aaec7: Pull complete 2025-12-04T09:20:56.9690817Z 5c02769dd8e5: Pull complete 2025-12-04T09:22:29.6516982Z 35041ce524ac: Pull complete 2025-12-04T09:22:29.8770414Z 2fa92dc5885e: Pull complete 2025-12-04T09:22:30.5814703Z 2b85eafbd92a: Pull complete 2025-12-04T09:22:30.7926671Z ff755a4ddad7: Pull complete 2025-12-04T09:22:31.0189848Z 09eb41bdf42d: Pull complete 2025-12-04T09:22:39.6513896Z 11ede4d59e93: Pull complete 2025-12-04T09:22:39.7628862Z 1283cd8f801a: Pull complete 2025-12-04T09:22:39.8827438Z 024fa855425f: Pull complete 2025-12-04T09:22:40.3040781Z 303e6747a62e: Pull complete 2025-12-04T09:22:40.5167700Z 3017cdf4838b: Pull complete 2025-12-04T09:22:40.9490064Z 6b6cd1c358e8: Pull complete 2025-12-04T09:22:41.1530338Z b2dd04501124: Pull complete 2025-12-04T09:22:41.3511650Z 55adc51fe589: Pull complete 2025-12-04T09:22:41.7609199Z a43ca0e4b837: Pull complete 2025-12-04T09:22:41.9831375Z b7212f17fd14: Pull complete 2025-12-04T09:22:42.2082522Z 083e42cac090: Pull complete 2025-12-04T09:22:42.6370080Z 0a00b784a4aa: Pull complete 2025-12-04T09:22:42.8452534Z c6173c779f7b: Pull complete 2025-12-04T09:22:46.6187352Z ed3d1e3387b9: Pull complete 2025-12-04T09:22:46.8399683Z b29343478586: Pull complete 2025-12-04T09:22:48.2753305Z c6f0520487fb: Pull complete 2025-12-04T09:23:49.1051218Z 148171691cd4: Pull complete 2025-12-04T09:23:49.1283852Z 2c666d30ed77: Pull complete 2025-12-04T09:23:49.1508052Z 5d8d3a0a98e0: Pull complete 2025-12-04T09:23:49.1950509Z b06bafce9e81: Pull complete 2025-12-04T09:23:49.2406407Z 15e0d7e4590d: Pull complete 2025-12-04T09:23:49.2630676Z a514bd1add31: Pull complete 2025-12-04T09:23:49.3080642Z 57b84ee60002: Pull complete 2025-12-04T09:23:49.3529533Z b8babeff6d81: Pull complete 2025-12-04T09:23:49.3748269Z 83779ddf6a85: Pull complete 2025-12-04T09:23:49.4205118Z 8b7620c0d736: Pull complete 2025-12-04T09:23:49.4645232Z 3bcfa090e4ef: Pull complete 2025-12-04T09:23:49.4873194Z eb0504ec4d92: Pull complete 2025-12-04T09:23:49.5308197Z 15d0fec09d7b: Pull complete 2025-12-04T09:23:49.5547729Z cca81fcc62a9: Pull complete 2025-12-04T09:23:49.5978823Z b0b8f9b5c6ab: Pull complete 2025-12-04T09:23:49.6202446Z 0606ca4d47a8: Pull complete 2025-12-04T09:23:49.6642097Z 2f80a4e1b3b9: Pull complete 2025-12-04T09:23:49.6858117Z 35c916fb1bd0: Pull complete 2025-12-04T09:23:56.6704684Z 195537b7dafc: Pull complete 2025-12-04T09:23:56.7065148Z dc454fd3967e: Pull complete 2025-12-04T09:23:56.7434052Z 701b34f115fa: Pull complete 2025-12-04T09:23:56.7808316Z 39cefc00ffed: Pull complete 2025-12-04T09:23:56.8161471Z 6ae51eb61a32: Pull complete 2025-12-04T09:23:56.8465265Z 1fd5341e66df: Pull complete 2025-12-04T09:23:58.5781614Z 72a7c87e35e4: Pull complete 2025-12-04T09:23:58.7220024Z ec36862ac98e: Pull complete 2025-12-04T09:24:00.4203010Z 05ddbf246e8a: Pull complete 2025-12-04T09:24:00.6386680Z Digest: sha256:ba21003510dba4bdeed83df81a56fa468e0ee1b612a9445ae1f402a280804f97 2025-12-04T09:24:00.6529660Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:24:00.6589224Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:24:00.6668118Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-12-04T09:24:00.6669050Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-12-04T09:24:00.6681045Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:24:00.6681413Z env: 2025-12-04T09:24:00.6681638Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:24:00.6681889Z ##[endgroup] 2025-12-04T09:24:00.6867177Z ##[group]Run pytorch/test-infra/.github/actions/setup-nvidia@main 2025-12-04T09:24:00.6867590Z with: 2025-12-04T09:24:00.6867819Z driver-version: 580.82.07 2025-12-04T09:24:00.6868086Z env: 2025-12-04T09:24:00.6868303Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:24:00.6868560Z ##[endgroup] 2025-12-04T09:24:00.6917044Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-12-04T09:24:00.6918123Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-12-04T09:24:00.6927906Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:24:00.6928267Z env: 2025-12-04T09:24:00.6928481Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:24:00.6928732Z ##[endgroup] 2025-12-04T09:24:00.6991576Z ##[group]Run set -euo pipefail 2025-12-04T09:24:00.6991922Z set -euo pipefail 2025-12-04T09:24:00.6992234Z  2025-12-04T09:24:00.6992453Z has_gpu=false 2025-12-04T09:24:00.6992708Z devices="" 2025-12-04T09:24:00.6992951Z  2025-12-04T09:24:00.6993234Z if command -v nvidia-smi >/dev/null 2>&1; then 2025-12-04T09:24:00.6993693Z  if nvidia-smi -L >/tmp/nvidia_devices 2>/dev/null; then 2025-12-04T09:24:00.6994080Z  has_gpu=true 2025-12-04T09:24:00.6994397Z  devices=$(cat /tmp/nvidia_devices) 2025-12-04T09:24:00.6994713Z  fi 2025-12-04T09:24:00.6994941Z fi 2025-12-04T09:24:00.6995157Z  2025-12-04T09:24:00.6995389Z if [ "$has_gpu" = false ]; then 2025-12-04T09:24:00.6995796Z  if ls /dev/nvidia* >/tmp/nvidia_devices 2>/dev/null; then 2025-12-04T09:24:00.6996188Z  has_gpu=true 2025-12-04T09:24:00.6996493Z  devices=$(cat /tmp/nvidia_devices) 2025-12-04T09:24:00.6996820Z  fi 2025-12-04T09:24:00.6997041Z fi 2025-12-04T09:24:00.6997250Z  2025-12-04T09:24:00.6997567Z if [ "$has_gpu" = false ] && command -v lspci >/dev/null 2>&1; then 2025-12-04T09:24:00.6998074Z  if lspci | grep -i 'nvidia' >/tmp/nvidia_devices 2>/dev/null; then 2025-12-04T09:24:00.6998484Z  has_gpu=true 2025-12-04T09:24:00.6998778Z  devices=$(cat /tmp/nvidia_devices) 2025-12-04T09:24:00.6999088Z  fi 2025-12-04T09:24:00.6999303Z fi 2025-12-04T09:24:00.6999506Z  2025-12-04T09:24:00.6999816Z printf 'HAS_NVIDIA=%s\n' "$has_gpu" >> "$GITHUB_OUTPUT" 2025-12-04T09:24:00.7000350Z printf 'DETECTED_DEVICES<> "$GITHUB_OUTPUT" 2025-12-04T09:24:00.7009494Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:24:00.7009861Z env: 2025-12-04T09:24:00.7010080Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:24:00.7010332Z ##[endgroup] 2025-12-04T09:24:02.4483345Z ##[group]Run if [ "${HAS_NVIDIA}" = "true" ]; then 2025-12-04T09:24:02.4483747Z if [ "${HAS_NVIDIA}" = "true" ]; then 2025-12-04T09:24:02.4484125Z  echo "HAS_NVIDIA_GPU=true" >> "${GITHUB_ENV}" 2025-12-04T09:24:02.4484634Z  echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" 2025-12-04T09:24:02.4485141Z else 2025-12-04T09:24:02.4485430Z  echo "HAS_NVIDIA_GPU=false" >> "${GITHUB_ENV}" 2025-12-04T09:24:02.4485779Z fi 2025-12-04T09:24:02.4494747Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:24:02.4495111Z env: 2025-12-04T09:24:02.4495328Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:24:02.4495588Z HAS_NVIDIA: true 2025-12-04T09:24:02.4495810Z ##[endgroup] 2025-12-04T09:24:02.4609160Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2025-12-04T09:24:02.4609567Z with: 2025-12-04T09:24:02.4609774Z timeout_minutes: 10 2025-12-04T09:24:02.4610023Z max_attempts: 3 2025-12-04T09:24:02.4635451Z command: # Is it disgusting to have a full shell script here in this github action? Sure # But is it the best way to make it so that this action relies on nothing else? Absolutely set -eou pipefail DISTRIBUTION=$(. /etc/os-release;echo $ID$VERSION_ID) DRIVER_FN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run" install_nvidia_docker2_amzn2() { ( set -x # Needed for yum-config-manager sudo yum install -y yum-utils if [[ "${DISTRIBUTION}" == "amzn2023" ]] ; then YUM_REPO_URL="https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo" else # Amazon Linux 2 YUM_REPO_URL="https://nvidia.github.io/nvidia-docker/${DISTRIBUTION}/nvidia-docker.repo" fi sudo yum-config-manager --add-repo "${YUM_REPO_URL}" sudo yum install -y \ nvidia-container-toolkit-1.17.8 \ libnvidia-container-tools-1.17.8 \ libnvidia-container1-1.17.8 \ nvidia-container-toolkit-base-1.17.8 sudo systemctl restart docker ) } install_nvidia_docker2_ubuntu20() { ( set -x # Install nvidia-driver package if not installed status="$(dpkg-query -W --showformat='${db:Status-Status}' nvidia-docker2 2>&1)" if [ ! $? = 0 ] || [ ! "$status" = installed ]; then sudo apt-get install -y nvidia-container-toolkit-1.17.8 sudo systemctl restart docker fi ) } pre_install_nvidia_driver_amzn2() { ( # Purge any nvidia driver installed from RHEL repo sudo yum remove -y nvidia-driver-latest-dkms ) } install_nvidia_driver_common() { ( # Try to gather more information about the runner and its existing NVIDIA driver if any echo "Before installing NVIDIA driver" lspci lsmod modinfo nvidia || true HAS_NVIDIA_DRIVER=0 # Check if NVIDIA driver has already been installed if [ -x "$(command -v nvidia-smi)" ]; then set +e # The driver exists, check its version next. Also check only the first GPU if there are more than one of them # so that the same driver version is not print over multiple lines INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then echo "Failed to get NVIDIA driver version ($INSTALLED_DRIVER_VERSION). Continuing" elif [ "$INSTALLED_DRIVER_VERSION" != "$DRIVER_VERSION" ]; then echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has been installed, but we expect to have $DRIVER_VERSION instead. Continuing" # Turn off persistent mode so that the installation script can unload the kernel module sudo killall nvidia-persistenced || true else HAS_NVIDIA_DRIVER=1 echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has already been installed. Skipping NVIDIA driver installation" fi set -e fi if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then # CAUTION: this may need to be updated in future if [ "${DISTRIBUTION}" != ubuntu20.04 ]; then sudo yum groupinstall -y "Development Tools" # ensure our kernel install is the same as our underlying kernel, # groupinstall "Development Tools" has a habit of mismatching kernel headers sudo yum install -y "kernel-devel-uname-r == $(uname -r)" sudo modprobe backlight fi sudo curl -fsL -o /tmp/nvidia_driver "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN" set +e sudo /bin/bash /tmp/nvidia_driver -s --no-drm NVIDIA_INSTALLATION_STATUS=$? RESET_GPU=0 if [ "$NVIDIA_INSTALLATION_STATUS" -ne 0 ]; then sudo cat /var/log/nvidia-installer.log # Fail to install NVIDIA driver, try to reset the GPU RESET_GPU=1 elif [ -x "$(command -v nvidia-smi)" ]; then # Check again if nvidia-smi works even if the driver installation completes successfully INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then RESET_GPU=1 fi fi if [ "$RESET_GPU" -eq 1 ]; then NVIDIA_DEVICES=$(lspci -D | grep -i NVIDIA | cut -d' ' -f1) # The GPU can get stuck in a failure state if somehow the test crashs the GPU microcode. When this # happens, we'll try to reset all NVIDIA devices https://github.com/pytorch/pytorch/issues/88388 for PCI_ID in $NVIDIA_DEVICES; do DEVICE_ENABLED=$(cat /sys/bus/pci/devices/$PCI_ID/enable) echo "Reseting $PCI_ID (enabled state: $DEVICE_ENABLED)" # This requires sudo permission of course echo "1" | sudo tee /sys/bus/pci/devices/$PCI_ID/reset sleep 1 done fi sudo rm -fv /tmp/nvidia_driver set -e fi ) } post_install_nvidia_driver_common() { ( sudo modprobe nvidia || true echo "After installing NVIDIA driver" lspci lsmod modinfo nvidia || true ( set +e nvidia-smi # NB: Annoyingly, nvidia-smi command returns successfully with return code 0 even in # the case where the driver has already crashed as it still can get the driver version # and some basic information like the bus ID. However, the rest of the information # would be missing (ERR!), for example: # # +-----------------------------------------------------------------------------+ # | NVIDIA-SMI 525.89.02 Driver Version: 525.89.02 CUDA Version: 12.0 | # |-------------------------------+----------------------+----------------------+ # | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | # | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | # | | | MIG M. | # |===============================+======================+======================| # | 0 ERR! Off | 00000000:00:1E.0 Off | ERR! | # |ERR! ERR! ERR! ERR! / ERR! | 4184MiB / 23028MiB | ERR! Default | # | | | ERR! | # +-------------------------------+----------------------+----------------------+ # # +-----------------------------------------------------------------------------+ # | Processes: | # | GPU GI CI PID Type Process name GPU Memory | # | ID ID Usage | # |=============================================================================| # +-----------------------------------------------------------------------------+ # # This should be reported as a failure instead as it will guarantee to fail when # Docker tries to run with --gpus all # # So, the correct check here is to query one of the missing piece of info like # GPU name, so that the command can fail accordingly nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 NVIDIA_SMI_STATUS=$? # Allowable exit statuses for nvidia-smi, see: https://github.com/NVIDIA/gpu-operator/issues/285 if [ "$NVIDIA_SMI_STATUS" -eq 0 ] || [ "$NVIDIA_SMI_STATUS" -eq 14 ]; then echo "INFO: Ignoring allowed status ${NVIDIA_SMI_STATUS}" else echo "ERROR: nvidia-smi exited with unresolved status ${NVIDIA_SMI_STATUS}" exit ${NVIDIA_SMI_STATUS} fi set -e ) ) } install_nvidia_driver_amzn2() { ( set -x pre_install_nvidia_driver_amzn2 install_nvidia_driver_common post_install_nvidia_driver_common ) } install_nvidia_driver_ubuntu20() { ( set -x install_nvidia_driver_common post_install_nvidia_driver_common ) } echo "== Installing nvidia driver ${DRIVER_FN} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_driver_amzn2 ;; ubuntu20.04) install_nvidia_driver_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Install container toolkit based on distribution echo "== Installing nvidia container toolkit for ${DISTRIBUTION} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_docker2_amzn2 ;; ubuntu20.04) install_nvidia_docker2_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Fix https://github.com/NVIDIA/nvidia-docker/issues/1648 on runners with # more than one GPUs. This just needs to be run once. The command fails # on subsequent runs and complains that the mode is already on, but that's # ok sudo nvidia-persistenced || true # This should show persistence mode ON nvidia-smi # check if the container-toolkit is correctly installed and CUDA is available inside a container docker run --rm -t --gpus=all public.ecr.aws/docker/library/python:3.13 nvidia-smi 2025-12-04T09:24:02.4660488Z retry_wait_seconds: 10 2025-12-04T09:24:02.4660778Z polling_interval_seconds: 1 2025-12-04T09:24:02.4661056Z warning_on_retry: true 2025-12-04T09:24:02.4661334Z continue_on_error: false 2025-12-04T09:24:02.4661590Z env: 2025-12-04T09:24:02.4661798Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:24:02.4662063Z HAS_NVIDIA_GPU: true 2025-12-04T09:24:02.4662382Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:24:02.4662737Z DRIVER_VERSION: 580.82.07 2025-12-04T09:24:02.4662999Z ##[endgroup] 2025-12-04T09:24:02.6102836Z == Installing nvidia driver NVIDIA-Linux-x86_64-580.82.07.run == 2025-12-04T09:24:02.6103841Z + pre_install_nvidia_driver_amzn2 2025-12-04T09:24:02.6106787Z + sudo yum remove -y nvidia-driver-latest-dkms 2025-12-04T09:24:03.3547853Z No match for argument: nvidia-driver-latest-dkms 2025-12-04T09:24:03.3548264Z No packages marked for removal. 2025-12-04T09:24:03.3614278Z Dependencies resolved. 2025-12-04T09:24:03.3624761Z Nothing to do. 2025-12-04T09:24:03.3625735Z Complete! 2025-12-04T09:24:03.4020451Z + install_nvidia_driver_common 2025-12-04T09:24:03.4025695Z + echo 'Before installing NVIDIA driver' 2025-12-04T09:24:03.4027369Z + lspci 2025-12-04T09:24:03.4030017Z Before installing NVIDIA driver 2025-12-04T09:24:03.4300369Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2025-12-04T09:24:03.4300921Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2025-12-04T09:24:03.4301558Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2025-12-04T09:24:03.4302117Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2025-12-04T09:24:03.4302779Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2025-12-04T09:24:03.4303702Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2025-12-04T09:24:03.4304267Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2025-12-04T09:24:03.4304787Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller 2025-12-04T09:24:03.4305214Z + lsmod 2025-12-04T09:24:03.4356899Z Module Size Used by 2025-12-04T09:24:03.4357620Z nvidia_uvm 1925120 0 2025-12-04T09:24:03.4358034Z nvidia 14286848 1 nvidia_uvm 2025-12-04T09:24:03.4358423Z drm 602112 1 nvidia 2025-12-04T09:24:03.4358753Z drm_panel_orientation_quirks 32768 1 drm 2025-12-04T09:24:03.4359087Z backlight 24576 1 drm 2025-12-04T09:24:03.4359420Z i2c_core 110592 2 nvidia,drm 2025-12-04T09:24:03.4359841Z xt_conntrack 16384 1 2025-12-04T09:24:03.4360425Z nft_chain_nat 16384 3 2025-12-04T09:24:03.4360807Z xt_MASQUERADE 20480 1 2025-12-04T09:24:03.4361143Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2025-12-04T09:24:03.4361552Z nf_conntrack_netlink 57344 0 2025-12-04T09:24:03.4361989Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2025-12-04T09:24:03.4362450Z nf_defrag_ipv6 24576 1 nf_conntrack 2025-12-04T09:24:03.4362786Z nf_defrag_ipv4 16384 1 nf_conntrack 2025-12-04T09:24:03.4363107Z xfrm_user 57344 1 2025-12-04T09:24:03.4363393Z xfrm_algo 16384 1 xfrm_user 2025-12-04T09:24:03.4363695Z xt_addrtype 16384 2 2025-12-04T09:24:03.4363970Z nft_compat 20480 4 2025-12-04T09:24:03.4364297Z nf_tables 311296 57 nft_compat,nft_chain_nat 2025-12-04T09:24:03.4364733Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2025-12-04T09:24:03.4365138Z br_netfilter 36864 0 2025-12-04T09:24:03.4365430Z bridge 323584 1 br_netfilter 2025-12-04T09:24:03.4365741Z stp 16384 1 bridge 2025-12-04T09:24:03.4366041Z llc 16384 2 bridge,stp 2025-12-04T09:24:03.4366335Z overlay 167936 0 2025-12-04T09:24:03.4366592Z tls 139264 0 2025-12-04T09:24:03.4366854Z nls_ascii 16384 1 2025-12-04T09:24:03.4367119Z nls_cp437 20480 1 2025-12-04T09:24:03.4367379Z vfat 24576 1 2025-12-04T09:24:03.4367641Z fat 86016 1 vfat 2025-12-04T09:24:03.4367921Z sunrpc 700416 1 2025-12-04T09:24:03.4368193Z ghash_clmulni_intel 16384 0 2025-12-04T09:24:03.4368457Z i8042 45056 0 2025-12-04T09:24:03.4368723Z serio 28672 3 i8042 2025-12-04T09:24:03.4368998Z ena 184320 0 2025-12-04T09:24:03.4369253Z button 24576 0 2025-12-04T09:24:03.4369523Z sch_fq_codel 20480 17 2025-12-04T09:24:03.4369791Z fuse 184320 1 2025-12-04T09:24:03.4370042Z loop 36864 0 2025-12-04T09:24:03.4370298Z dm_mod 188416 0 2025-12-04T09:24:03.4370560Z configfs 57344 1 2025-12-04T09:24:03.4370826Z dmi_sysfs 20480 0 2025-12-04T09:24:03.4371095Z crc32_pclmul 16384 0 2025-12-04T09:24:03.4371365Z crc32c_intel 24576 0 2025-12-04T09:24:03.4371625Z efivarfs 24576 1 2025-12-04T09:24:03.4371900Z + modinfo nvidia 2025-12-04T09:24:03.4379968Z filename: /lib/modules/6.1.150-174.273.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2025-12-04T09:24:03.4380657Z import_ns: DMA_BUF 2025-12-04T09:24:03.4381012Z alias: char-major-195-* 2025-12-04T09:24:03.4381397Z version: 580.82.07 2025-12-04T09:24:03.4381693Z supported: external 2025-12-04T09:24:03.4381985Z license: Dual MIT/GPL 2025-12-04T09:24:03.4382389Z firmware: nvidia/580.82.07/gsp_tu10x.bin 2025-12-04T09:24:03.4382895Z firmware: nvidia/580.82.07/gsp_ga10x.bin 2025-12-04T09:24:03.4383474Z srcversion: BA7240A71DCF7DC6FE88C1D 2025-12-04T09:24:03.4383962Z alias: of:N*T*Cnvidia,tegra264-displayC* 2025-12-04T09:24:03.4384428Z alias: of:N*T*Cnvidia,tegra264-display 2025-12-04T09:24:03.4384798Z alias: of:N*T*Cnvidia,tegra234-displayC* 2025-12-04T09:24:03.4385162Z alias: of:N*T*Cnvidia,tegra234-display 2025-12-04T09:24:03.4385521Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2025-12-04T09:24:03.4386023Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2025-12-04T09:24:03.4386374Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2025-12-04T09:24:03.4386701Z depends: i2c-core,drm 2025-12-04T09:24:03.4386992Z retpoline: Y 2025-12-04T09:24:03.4387308Z name: nvidia 2025-12-04T09:24:03.4387728Z vermagic: 6.1.150-174.273.amzn2023.x86_64 SMP preempt mod_unload modversions 2025-12-04T09:24:03.4388226Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2025-12-04T09:24:03.4388796Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2025-12-04T09:24:03.4389239Z parm: NVreg_ResmanDebugLevel:int 2025-12-04T09:24:03.4389670Z parm: NVreg_RmLogonRC:int 2025-12-04T09:24:03.4390106Z parm: NVreg_ModifyDeviceFiles:int 2025-12-04T09:24:03.4390566Z parm: NVreg_DeviceFileUID:int 2025-12-04T09:24:03.4391033Z parm: NVreg_DeviceFileGID:int 2025-12-04T09:24:03.4391556Z parm: NVreg_DeviceFileMode:int 2025-12-04T09:24:03.4392091Z parm: NVreg_InitializeSystemMemoryAllocations:int 2025-12-04T09:24:03.4392652Z parm: NVreg_UsePageAttributeTable:int 2025-12-04T09:24:03.4393138Z parm: NVreg_EnablePCIeGen3:int 2025-12-04T09:24:03.4393504Z parm: NVreg_EnableMSI:int 2025-12-04T09:24:03.4393829Z parm: NVreg_EnableStreamMemOPs:int 2025-12-04T09:24:03.4394210Z parm: NVreg_RestrictProfilingToAdminUsers:int 2025-12-04T09:24:03.4394639Z parm: NVreg_PreserveVideoMemoryAllocations:int 2025-12-04T09:24:03.4395034Z parm: NVreg_EnableS0ixPowerManagement:int 2025-12-04T09:24:03.4395466Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2025-12-04T09:24:03.4395898Z parm: NVreg_DynamicPowerManagement:int 2025-12-04T09:24:03.4396335Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2025-12-04T09:24:03.4396767Z parm: NVreg_EnableGpuFirmware:int 2025-12-04T09:24:03.4397128Z parm: NVreg_EnableGpuFirmwareLogs:int 2025-12-04T09:24:03.4397509Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2025-12-04T09:24:03.4397904Z parm: NVreg_EnableUserNUMAManagement:int 2025-12-04T09:24:03.4398265Z parm: NVreg_MemoryPoolSize:int 2025-12-04T09:24:03.4398603Z parm: NVreg_KMallocHeapMaxSize:int 2025-12-04T09:24:03.4398948Z parm: NVreg_VMallocHeapMaxSize:int 2025-12-04T09:24:03.4399289Z parm: NVreg_IgnoreMMIOCheck:int 2025-12-04T09:24:03.4399619Z parm: NVreg_NvLinkDisable:int 2025-12-04T09:24:03.4399979Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2025-12-04T09:24:03.4400366Z parm: NVreg_RegisterPCIDriver:int 2025-12-04T09:24:03.4400741Z parm: NVreg_RegisterPlatformDeviceDriver:int 2025-12-04T09:24:03.4401120Z parm: NVreg_EnableResizableBar:int 2025-12-04T09:24:03.4401476Z parm: NVreg_EnableDbgBreakpoint:int 2025-12-04T09:24:03.4401840Z parm: NVreg_EnableNonblockingOpen:int 2025-12-04T09:24:03.4402214Z parm: NVreg_CoherentGPUMemoryMode:charp 2025-12-04T09:24:03.4402581Z parm: NVreg_RegistryDwords:charp 2025-12-04T09:24:03.4402940Z parm: NVreg_RegistryDwordsPerDevice:charp 2025-12-04T09:24:03.4403291Z parm: NVreg_RmMsg:charp 2025-12-04T09:24:03.4403596Z parm: NVreg_GpuBlacklist:charp 2025-12-04T09:24:03.4403936Z parm: NVreg_TemporaryFilePath:charp 2025-12-04T09:24:03.4404288Z parm: NVreg_ExcludedGpus:charp 2025-12-04T09:24:03.4404619Z parm: NVreg_DmaRemapPeerMmio:int 2025-12-04T09:24:03.4404965Z parm: NVreg_RmNvlinkBandwidth:charp 2025-12-04T09:24:03.4405340Z parm: NVreg_RmNvlinkBandwidthLinkCount:int 2025-12-04T09:24:03.4405708Z parm: NVreg_ImexChannelCount:int 2025-12-04T09:24:03.4406050Z parm: NVreg_CreateImexChannel0:int 2025-12-04T09:24:03.4406419Z parm: NVreg_GrdmaPciTopoCheckOverride:int 2025-12-04T09:24:03.4406780Z parm: rm_firmware_active:charp 2025-12-04T09:24:03.4407224Z + HAS_NVIDIA_DRIVER=0 2025-12-04T09:24:03.4407491Z ++ command -v nvidia-smi 2025-12-04T09:24:03.4407762Z + '[' -x /usr/bin/nvidia-smi ']' 2025-12-04T09:24:03.4408028Z + set +e 2025-12-04T09:24:03.4408359Z ++ nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0 2025-12-04T09:24:05.1684618Z + INSTALLED_DRIVER_VERSION=580.82.07 2025-12-04T09:24:05.1684986Z + NVIDIA_SMI_STATUS=0 2025-12-04T09:24:05.1685558Z + '[' 0 -ne 0 ']' 2025-12-04T09:24:05.1685802Z + '[' 580.82.07 '!=' 580.82.07 ']' 2025-12-04T09:24:05.1686307Z + HAS_NVIDIA_DRIVER=1 2025-12-04T09:24:05.1686789Z + echo 'NVIDIA driver (580.82.07) has already been installed. Skipping NVIDIA driver installation' 2025-12-04T09:24:05.1687267Z + set -e 2025-12-04T09:24:05.1687473Z + '[' 1 -eq 0 ']' 2025-12-04T09:24:05.1687889Z NVIDIA driver (580.82.07) has already been installed. Skipping NVIDIA driver installation 2025-12-04T09:24:05.1688679Z + post_install_nvidia_driver_common 2025-12-04T09:24:05.1692192Z + sudo modprobe nvidia 2025-12-04T09:24:05.2943405Z + echo 'After installing NVIDIA driver' 2025-12-04T09:24:05.2943747Z + lspci 2025-12-04T09:24:05.2943981Z After installing NVIDIA driver 2025-12-04T09:24:05.3066506Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2025-12-04T09:24:05.3067587Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2025-12-04T09:24:05.3068717Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2025-12-04T09:24:05.3069823Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2025-12-04T09:24:05.3070819Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2025-12-04T09:24:05.3071908Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2025-12-04T09:24:05.3072523Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2025-12-04T09:24:05.3073020Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller 2025-12-04T09:24:05.3073439Z + lsmod 2025-12-04T09:24:05.3105671Z Module Size Used by 2025-12-04T09:24:05.3105991Z nvidia_uvm 1925120 0 2025-12-04T09:24:05.3106284Z nvidia 14286848 1 nvidia_uvm 2025-12-04T09:24:05.3106583Z drm 602112 1 nvidia 2025-12-04T09:24:05.3106909Z drm_panel_orientation_quirks 32768 1 drm 2025-12-04T09:24:05.3107240Z backlight 24576 1 drm 2025-12-04T09:24:05.3107536Z i2c_core 110592 2 nvidia,drm 2025-12-04T09:24:05.3107836Z xt_conntrack 16384 1 2025-12-04T09:24:05.3108127Z nft_chain_nat 16384 3 2025-12-04T09:24:05.3108400Z xt_MASQUERADE 20480 1 2025-12-04T09:24:05.3108708Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2025-12-04T09:24:05.3109063Z nf_conntrack_netlink 57344 0 2025-12-04T09:24:05.3109481Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2025-12-04T09:24:05.3109938Z nf_defrag_ipv6 24576 1 nf_conntrack 2025-12-04T09:24:05.3110267Z nf_defrag_ipv4 16384 1 nf_conntrack 2025-12-04T09:24:05.3110588Z xfrm_user 57344 1 2025-12-04T09:24:05.3110866Z xfrm_algo 16384 1 xfrm_user 2025-12-04T09:24:05.3111161Z xt_addrtype 16384 2 2025-12-04T09:24:05.3111428Z nft_compat 20480 4 2025-12-04T09:24:05.3111752Z nf_tables 311296 57 nft_compat,nft_chain_nat 2025-12-04T09:24:05.3112184Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2025-12-04T09:24:05.3112587Z br_netfilter 36864 0 2025-12-04T09:24:05.3112882Z bridge 323584 1 br_netfilter 2025-12-04T09:24:05.3113194Z stp 16384 1 bridge 2025-12-04T09:24:05.3113487Z llc 16384 2 bridge,stp 2025-12-04T09:24:05.3113785Z overlay 167936 0 2025-12-04T09:24:05.3114045Z tls 139264 0 2025-12-04T09:24:05.3114298Z nls_ascii 16384 1 2025-12-04T09:24:05.3114561Z nls_cp437 20480 1 2025-12-04T09:24:05.3115122Z vfat 24576 1 2025-12-04T09:24:05.3115385Z fat 86016 1 vfat 2025-12-04T09:24:05.3115662Z sunrpc 700416 1 2025-12-04T09:24:05.3115932Z ghash_clmulni_intel 16384 0 2025-12-04T09:24:05.3116191Z i8042 45056 0 2025-12-04T09:24:05.3116455Z serio 28672 3 i8042 2025-12-04T09:24:05.3116737Z ena 184320 0 2025-12-04T09:24:05.3117150Z button 24576 0 2025-12-04T09:24:05.3117416Z sch_fq_codel 20480 17 2025-12-04T09:24:05.3117685Z fuse 184320 1 2025-12-04T09:24:05.3117946Z loop 36864 0 2025-12-04T09:24:05.3118199Z dm_mod 188416 0 2025-12-04T09:24:05.3118462Z configfs 57344 1 2025-12-04T09:24:05.3118721Z dmi_sysfs 20480 0 2025-12-04T09:24:05.3118979Z crc32_pclmul 16384 0 2025-12-04T09:24:05.3119245Z crc32c_intel 24576 0 2025-12-04T09:24:05.3119514Z efivarfs 24576 1 2025-12-04T09:24:05.3119766Z + modinfo nvidia 2025-12-04T09:24:05.3125416Z filename: /lib/modules/6.1.150-174.273.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2025-12-04T09:24:05.3125914Z import_ns: DMA_BUF 2025-12-04T09:24:05.3126181Z alias: char-major-195-* 2025-12-04T09:24:05.3126527Z version: 580.82.07 2025-12-04T09:24:05.3126874Z supported: external 2025-12-04T09:24:05.3127253Z license: Dual MIT/GPL 2025-12-04T09:24:05.3127648Z firmware: nvidia/580.82.07/gsp_tu10x.bin 2025-12-04T09:24:05.3128013Z firmware: nvidia/580.82.07/gsp_ga10x.bin 2025-12-04T09:24:05.3128352Z srcversion: BA7240A71DCF7DC6FE88C1D 2025-12-04T09:24:05.3128710Z alias: of:N*T*Cnvidia,tegra264-displayC* 2025-12-04T09:24:05.3129078Z alias: of:N*T*Cnvidia,tegra264-display 2025-12-04T09:24:05.3129442Z alias: of:N*T*Cnvidia,tegra234-displayC* 2025-12-04T09:24:05.3129814Z alias: of:N*T*Cnvidia,tegra234-display 2025-12-04T09:24:05.3130171Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2025-12-04T09:24:05.3130523Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2025-12-04T09:24:05.3130870Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2025-12-04T09:24:05.3131187Z depends: i2c-core,drm 2025-12-04T09:24:05.3131459Z retpoline: Y 2025-12-04T09:24:05.3131695Z name: nvidia 2025-12-04T09:24:05.3132067Z vermagic: 6.1.150-174.273.amzn2023.x86_64 SMP preempt mod_unload modversions 2025-12-04T09:24:05.3132562Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2025-12-04T09:24:05.3133030Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2025-12-04T09:24:05.3133474Z parm: NVreg_ResmanDebugLevel:int 2025-12-04T09:24:05.3133794Z parm: NVreg_RmLogonRC:int 2025-12-04T09:24:05.3134113Z parm: NVreg_ModifyDeviceFiles:int 2025-12-04T09:24:05.3134448Z parm: NVreg_DeviceFileUID:int 2025-12-04T09:24:05.3134759Z parm: NVreg_DeviceFileGID:int 2025-12-04T09:24:05.3135087Z parm: NVreg_DeviceFileMode:int 2025-12-04T09:24:05.3135471Z parm: NVreg_InitializeSystemMemoryAllocations:int 2025-12-04T09:24:05.3135873Z parm: NVreg_UsePageAttributeTable:int 2025-12-04T09:24:05.3136224Z parm: NVreg_EnablePCIeGen3:int 2025-12-04T09:24:05.3136550Z parm: NVreg_EnableMSI:int 2025-12-04T09:24:05.3136867Z parm: NVreg_EnableStreamMemOPs:int 2025-12-04T09:24:05.3137257Z parm: NVreg_RestrictProfilingToAdminUsers:int 2025-12-04T09:24:05.3137673Z parm: NVreg_PreserveVideoMemoryAllocations:int 2025-12-04T09:24:05.3138073Z parm: NVreg_EnableS0ixPowerManagement:int 2025-12-04T09:24:05.3138502Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2025-12-04T09:24:05.3138936Z parm: NVreg_DynamicPowerManagement:int 2025-12-04T09:24:05.3139380Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2025-12-04T09:24:05.3140082Z parm: NVreg_EnableGpuFirmware:int 2025-12-04T09:24:05.3140448Z parm: NVreg_EnableGpuFirmwareLogs:int 2025-12-04T09:24:05.3140839Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2025-12-04T09:24:05.3141226Z parm: NVreg_EnableUserNUMAManagement:int 2025-12-04T09:24:05.3141588Z parm: NVreg_MemoryPoolSize:int 2025-12-04T09:24:05.3141925Z parm: NVreg_KMallocHeapMaxSize:int 2025-12-04T09:24:05.3142301Z parm: NVreg_VMallocHeapMaxSize:int 2025-12-04T09:24:05.3142781Z parm: NVreg_IgnoreMMIOCheck:int 2025-12-04T09:24:05.3143108Z parm: NVreg_NvLinkDisable:int 2025-12-04T09:24:05.3143639Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2025-12-04T09:24:05.3144023Z parm: NVreg_RegisterPCIDriver:int 2025-12-04T09:24:05.3144401Z parm: NVreg_RegisterPlatformDeviceDriver:int 2025-12-04T09:24:05.3144781Z parm: NVreg_EnableResizableBar:int 2025-12-04T09:24:05.3145130Z parm: NVreg_EnableDbgBreakpoint:int 2025-12-04T09:24:05.3145500Z parm: NVreg_EnableNonblockingOpen:int 2025-12-04T09:24:05.3145893Z parm: NVreg_CoherentGPUMemoryMode:charp 2025-12-04T09:24:05.3146252Z parm: NVreg_RegistryDwords:charp 2025-12-04T09:24:05.3146606Z parm: NVreg_RegistryDwordsPerDevice:charp 2025-12-04T09:24:05.3146955Z parm: NVreg_RmMsg:charp 2025-12-04T09:24:05.3147261Z parm: NVreg_GpuBlacklist:charp 2025-12-04T09:24:05.3147604Z parm: NVreg_TemporaryFilePath:charp 2025-12-04T09:24:05.3147951Z parm: NVreg_ExcludedGpus:charp 2025-12-04T09:24:05.3148287Z parm: NVreg_DmaRemapPeerMmio:int 2025-12-04T09:24:05.3148628Z parm: NVreg_RmNvlinkBandwidth:charp 2025-12-04T09:24:05.3149003Z parm: NVreg_RmNvlinkBandwidthLinkCount:int 2025-12-04T09:24:05.3149376Z parm: NVreg_ImexChannelCount:int 2025-12-04T09:24:05.3149713Z parm: NVreg_CreateImexChannel0:int 2025-12-04T09:24:05.3150084Z parm: NVreg_GrdmaPciTopoCheckOverride:int 2025-12-04T09:24:05.3150449Z parm: rm_firmware_active:charp 2025-12-04T09:24:05.3150745Z + set +e 2025-12-04T09:24:05.3150946Z + nvidia-smi 2025-12-04T09:24:06.7686143Z Thu Dec 4 09:24:06 2025 2025-12-04T09:24:06.7686961Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:24:06.7687988Z | NVIDIA-SMI 580.82.07 Driver Version: 580.82.07 CUDA Version: 13.0 | 2025-12-04T09:24:06.7689040Z +-----------------------------------------+------------------------+----------------------+ 2025-12-04T09:24:06.7690077Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-12-04T09:24:06.7691190Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-12-04T09:24:06.7692084Z | | | MIG M. | 2025-12-04T09:24:06.7692814Z |=========================================+========================+======================| 2025-12-04T09:24:06.7772193Z | 0 NVIDIA A10G Off | 00000000:00:1E.0 Off | 0 | 2025-12-04T09:24:06.7772665Z | 0% 26C P0 60W / 300W | 0MiB / 23028MiB | 4% Default | 2025-12-04T09:24:06.7773087Z | | | N/A | 2025-12-04T09:24:06.7773527Z +-----------------------------------------+------------------------+----------------------+ 2025-12-04T09:24:06.7775185Z 2025-12-04T09:24:06.7775580Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:24:06.7776043Z | Processes: | 2025-12-04T09:24:06.7776503Z | GPU GI CI PID Type Process name GPU Memory | 2025-12-04T09:24:06.7776939Z | ID ID Usage | 2025-12-04T09:24:06.7777639Z |=========================================================================================| 2025-12-04T09:24:06.7778750Z | No running processes found | 2025-12-04T09:24:06.7779235Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:24:07.2024504Z + nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 2025-12-04T09:24:08.6557630Z NVIDIA A10G 2025-12-04T09:24:08.9277087Z + NVIDIA_SMI_STATUS=0 2025-12-04T09:24:08.9277374Z + '[' 0 -eq 0 ']' 2025-12-04T09:24:08.9277634Z + echo 'INFO: Ignoring allowed status 0' 2025-12-04T09:24:08.9277938Z + set -e 2025-12-04T09:24:08.9278166Z INFO: Ignoring allowed status 0 2025-12-04T09:24:08.9287927Z == Installing nvidia container toolkit for amzn2023 == 2025-12-04T09:24:08.9291401Z + sudo yum install -y yum-utils 2025-12-04T09:24:09.3630335Z Last metadata expiration check: 0:08:41 ago on Thu Dec 4 09:15:28 2025. 2025-12-04T09:24:09.3901453Z Package dnf-utils-4.3.0-13.amzn2023.0.5.noarch is already installed. 2025-12-04T09:24:09.4484236Z Dependencies resolved. 2025-12-04T09:24:09.4780966Z Nothing to do. 2025-12-04T09:24:09.4781345Z Complete! 2025-12-04T09:24:09.8709818Z + [[ amzn2023 == \a\m\z\n\2\0\2\3 ]] 2025-12-04T09:24:09.8710457Z + YUM_REPO_URL=https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-12-04T09:24:09.8711390Z + sudo yum-config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-12-04T09:24:10.2044337Z Adding repo from: https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-12-04T09:24:10.2567679Z + sudo yum install -y nvidia-container-toolkit-1.17.8 libnvidia-container-tools-1.17.8 libnvidia-container1-1.17.8 nvidia-container-toolkit-base-1.17.8 2025-12-04T09:24:10.8032286Z nvidia-container-toolkit 18 kB/s | 833 B 00:00 2025-12-04T09:24:10.8875540Z Dependencies resolved. 2025-12-04T09:24:10.9163424Z ================================================================================ 2025-12-04T09:24:10.9164548Z Package Arch Version Repository Size 2025-12-04T09:24:10.9165136Z ================================================================================ 2025-12-04T09:24:10.9165481Z Downgrading: 2025-12-04T09:24:10.9165876Z libnvidia-container-tools x86_64 1.17.8-1 nvidia-container-toolkit 40 k 2025-12-04T09:24:10.9166479Z libnvidia-container1 x86_64 1.17.8-1 nvidia-container-toolkit 1.0 M 2025-12-04T09:24:10.9167152Z nvidia-container-toolkit x86_64 1.17.8-1 nvidia-container-toolkit 1.2 M 2025-12-04T09:24:10.9167765Z nvidia-container-toolkit-base x86_64 1.17.8-1 nvidia-container-toolkit 5.8 M 2025-12-04T09:24:10.9168134Z 2025-12-04T09:24:10.9168262Z Transaction Summary 2025-12-04T09:24:10.9168587Z ================================================================================ 2025-12-04T09:24:10.9168923Z Downgrade 4 Packages 2025-12-04T09:24:10.9169082Z 2025-12-04T09:24:10.9169200Z Total download size: 8.0 M 2025-12-04T09:24:10.9169469Z Downloading Packages: 2025-12-04T09:24:10.9549847Z (1/4): libnvidia-container-tools-1.17.8-1.x86_6 1.1 MB/s | 40 kB 00:00 2025-12-04T09:24:10.9924125Z (2/4): libnvidia-container1-1.17.8-1.x86_64.rpm 13 MB/s | 1.0 MB 00:00 2025-12-04T09:24:11.0373263Z (3/4): nvidia-container-toolkit-1.17.8-1.x86_64 10 MB/s | 1.2 MB 00:00 2025-12-04T09:24:11.1842557Z (4/4): nvidia-container-toolkit-base-1.17.8-1.x 25 MB/s | 5.8 MB 00:00 2025-12-04T09:24:11.1851958Z -------------------------------------------------------------------------------- 2025-12-04T09:24:11.1854758Z Total 30 MB/s | 8.0 MB 00:00 2025-12-04T09:24:11.1857324Z Running transaction check 2025-12-04T09:24:11.1974366Z Transaction check succeeded. 2025-12-04T09:24:11.1974920Z Running transaction test 2025-12-04T09:24:11.2478995Z Transaction test succeeded. 2025-12-04T09:24:11.2482006Z Running transaction 2025-12-04T09:24:11.9940723Z Preparing : 1/1 2025-12-04T09:24:12.1135869Z Downgrading : nvidia-container-toolkit-base-1.17.8-1.x86_64 1/8 2025-12-04T09:24:12.1333674Z Downgrading : libnvidia-container1-1.17.8-1.x86_64 2/8 2025-12-04T09:24:12.1916983Z Running scriptlet: libnvidia-container1-1.17.8-1.x86_64 2/8 2025-12-04T09:24:12.3222857Z Downgrading : libnvidia-container-tools-1.17.8-1.x86_64 3/8 2025-12-04T09:24:12.3437041Z Downgrading : nvidia-container-toolkit-1.17.8-1.x86_64 4/8 2025-12-04T09:24:12.4359433Z Running scriptlet: nvidia-container-toolkit-1.17.8-1.x86_64 4/8 2025-12-04T09:24:12.4433910Z Running scriptlet: nvidia-container-toolkit-1.18.1-1.x86_64 5/8 2025-12-04T09:24:12.4434663Z Cleanup : nvidia-container-toolkit-1.18.1-1.x86_64 5/8 2025-12-04T09:24:12.4738703Z Running scriptlet: nvidia-container-toolkit-1.18.1-1.x86_64 5/8 2025-12-04T09:24:12.4811648Z Running scriptlet: libnvidia-container-tools-1.18.1-1.x86_64 6/8 2025-12-04T09:24:12.4812272Z Cleanup : libnvidia-container-tools-1.18.1-1.x86_64 6/8 2025-12-04T09:24:12.5105456Z Running scriptlet: libnvidia-container-tools-1.18.1-1.x86_64 6/8 2025-12-04T09:24:12.5189146Z Running scriptlet: libnvidia-container1-1.18.1-1.x86_64 7/8 2025-12-04T09:24:12.5189819Z Cleanup : libnvidia-container1-1.18.1-1.x86_64 7/8 2025-12-04T09:24:12.5456512Z Running scriptlet: libnvidia-container1-1.18.1-1.x86_64 7/8 2025-12-04T09:24:12.5532527Z Running scriptlet: nvidia-container-toolkit-base-1.18.1-1.x86_64 8/8 2025-12-04T09:24:12.5533545Z Cleanup : nvidia-container-toolkit-base-1.18.1-1.x86_64 8/8 2025-12-04T09:24:12.5788561Z Running scriptlet: nvidia-container-toolkit-base-1.18.1-1.x86_64 8/8 2025-12-04T09:24:12.6426262Z Running scriptlet: nvidia-container-toolkit-1.17.8-1.x86_64 8/8 2025-12-04T09:24:45.4845431Z Running scriptlet: nvidia-container-toolkit-base-1.18.1-1.x86_64 8/8 2025-12-04T09:24:45.4847914Z Verifying : libnvidia-container-tools-1.17.8-1.x86_64 1/8 2025-12-04T09:24:45.4848712Z Verifying : libnvidia-container-tools-1.18.1-1.x86_64 2/8 2025-12-04T09:24:45.4849446Z Verifying : libnvidia-container1-1.17.8-1.x86_64 3/8 2025-12-04T09:24:45.4850157Z Verifying : libnvidia-container1-1.18.1-1.x86_64 4/8 2025-12-04T09:24:45.4851413Z Verifying : nvidia-container-toolkit-1.17.8-1.x86_64 5/8 2025-12-04T09:24:45.4852222Z Verifying : nvidia-container-toolkit-1.18.1-1.x86_64 6/8 2025-12-04T09:24:45.4852812Z Verifying : nvidia-container-toolkit-base-1.17.8-1.x86_64 7/8 2025-12-04T09:24:45.6439585Z Verifying : nvidia-container-toolkit-base-1.18.1-1.x86_64 8/8================================================================================ 2025-12-04T09:24:45.6440179Z WARNING: 2025-12-04T09:24:45.6440446Z A newer release of "Amazon Linux" is available. 2025-12-04T09:24:45.6440688Z 2025-12-04T09:24:45.6440803Z Available Versions: 2025-12-04T09:24:45.6440962Z 2025-12-04T09:24:45.6441058Z Version 2023.9.20250929: 2025-12-04T09:24:45.6441387Z Run the following command to upgrade to 2023.9.20250929: 2025-12-04T09:24:45.6441652Z 2025-12-04T09:24:45.6441790Z dnf upgrade --releasever=2023.9.20250929 2025-12-04T09:24:45.6442010Z 2025-12-04T09:24:45.6442101Z Release notes: 2025-12-04T09:24:45.6442529Z https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes-2023.9.20250929.html 2025-12-04T09:24:45.6442917Z 2025-12-04T09:24:45.6443367Z Version 2023.9.20251014: 2025-12-04T09:24:45.6443704Z Run the following command to upgrade to 2023.9.20251014: 2025-12-04T09:24:45.6443966Z 2025-12-04T09:24:45.6444090Z dnf upgrade --releasever=2023.9.20251014 2025-12-04T09:24:45.6444315Z 2025-12-04T09:24:45.6444406Z Release notes: 2025-12-04T09:24:45.6444820Z https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes-2023.9.20251014.html 2025-12-04T09:24:45.6445362Z 2025-12-04T09:24:45.6445465Z Version 2023.9.20251020: 2025-12-04T09:24:45.6445785Z Run the following command to upgrade to 2023.9.20251020: 2025-12-04T09:24:45.6446058Z 2025-12-04T09:24:45.6446180Z dnf upgrade --releasever=2023.9.20251020 2025-12-04T09:24:45.6446397Z 2025-12-04T09:24:45.6446495Z Release notes: 2025-12-04T09:24:45.6446903Z https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes-2023.9.20251020.html 2025-12-04T09:24:45.6447288Z 2025-12-04T09:24:45.6447379Z Version 2023.9.20251027: 2025-12-04T09:24:45.6447711Z Run the following command to upgrade to 2023.9.20251027: 2025-12-04T09:24:45.6447970Z 2025-12-04T09:24:45.6448100Z dnf upgrade --releasever=2023.9.20251027 2025-12-04T09:24:45.6448320Z 2025-12-04T09:24:45.6448408Z Release notes: 2025-12-04T09:24:45.6448820Z https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes-2023.9.20251027.html 2025-12-04T09:24:45.6449194Z 2025-12-04T09:24:45.6449294Z Version 2023.9.20251105: 2025-12-04T09:24:45.6449621Z Run the following command to upgrade to 2023.9.20251105: 2025-12-04T09:24:45.6449891Z 2025-12-04T09:24:45.6450014Z dnf upgrade --releasever=2023.9.20251105 2025-12-04T09:24:45.6450238Z 2025-12-04T09:24:45.6450328Z Release notes: 2025-12-04T09:24:45.6450742Z https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes-2023.9.20251105.html 2025-12-04T09:24:45.6451116Z 2025-12-04T09:24:45.6451209Z Version 2023.9.20251110: 2025-12-04T09:24:45.6451536Z Run the following command to upgrade to 2023.9.20251110: 2025-12-04T09:24:45.6451798Z 2025-12-04T09:24:45.6451929Z dnf upgrade --releasever=2023.9.20251110 2025-12-04T09:24:45.6452148Z 2025-12-04T09:24:45.6452245Z Release notes: 2025-12-04T09:24:45.6452654Z https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes-2023.9.20251110.html 2025-12-04T09:24:45.6453039Z 2025-12-04T09:24:45.6453130Z Version 2023.9.20251117: 2025-12-04T09:24:45.6453455Z Run the following command to upgrade to 2023.9.20251117: 2025-12-04T09:24:45.6453722Z 2025-12-04T09:24:45.6453846Z dnf upgrade --releasever=2023.9.20251117 2025-12-04T09:24:45.6454072Z 2025-12-04T09:24:45.6454162Z Release notes: 2025-12-04T09:24:45.6454578Z https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes-2023.9.20251117.html 2025-12-04T09:24:45.6454954Z 2025-12-04T09:24:45.6455078Z ================================================================================ 2025-12-04T09:24:45.7031534Z 2025-12-04T09:24:45.7032068Z 2025-12-04T09:24:45.7032393Z Downgraded: 2025-12-04T09:24:45.7032810Z libnvidia-container-tools-1.17.8-1.x86_64 2025-12-04T09:24:45.7033382Z libnvidia-container1-1.17.8-1.x86_64 2025-12-04T09:24:45.7033934Z nvidia-container-toolkit-1.17.8-1.x86_64 2025-12-04T09:24:45.7034523Z nvidia-container-toolkit-base-1.17.8-1.x86_64 2025-12-04T09:24:45.7034880Z 2025-12-04T09:24:45.7034976Z Complete! 2025-12-04T09:24:45.7784414Z + sudo systemctl restart docker 2025-12-04T09:24:52.2500455Z Thu Dec 4 09:24:52 2025 2025-12-04T09:24:52.2501538Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:24:52.2502717Z | NVIDIA-SMI 580.82.07 Driver Version: 580.82.07 CUDA Version: 13.0 | 2025-12-04T09:24:52.2503842Z +-----------------------------------------+------------------------+----------------------+ 2025-12-04T09:24:52.2505293Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-12-04T09:24:52.2505902Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-12-04T09:24:52.2506356Z | | | MIG M. | 2025-12-04T09:24:52.2506709Z |=========================================+========================+======================| 2025-12-04T09:24:52.2596286Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2025-12-04T09:24:52.2596929Z | 0% 26C P0 55W / 300W | 0MiB / 23028MiB | 4% Default | 2025-12-04T09:24:52.2597485Z | | | N/A | 2025-12-04T09:24:52.2598061Z +-----------------------------------------+------------------------+----------------------+ 2025-12-04T09:24:52.2598476Z 2025-12-04T09:24:52.2598700Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:24:52.2599157Z | Processes: | 2025-12-04T09:24:52.2599650Z | GPU GI CI PID Type Process name GPU Memory | 2025-12-04T09:24:52.2600091Z | ID ID Usage | 2025-12-04T09:24:52.2600470Z |=========================================================================================| 2025-12-04T09:24:52.2601381Z | No running processes found | 2025-12-04T09:24:52.2602051Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:24:52.4325140Z Unable to find image 'public.ecr.aws/docker/library/python:3.13' locally 2025-12-04T09:24:52.6359196Z 3.13: Pulling from docker/library/python 2025-12-04T09:24:52.7132419Z 53c88f1dfeb7: Pulling fs layer 2025-12-04T09:24:52.7132884Z eae668646f44: Pulling fs layer 2025-12-04T09:24:52.7133286Z ff2e6e687b6c: Pulling fs layer 2025-12-04T09:24:52.7133587Z 7c40a3faff76: Pulling fs layer 2025-12-04T09:24:52.7133865Z 967a3b1c8fef: Pulling fs layer 2025-12-04T09:24:52.7134235Z a64e1a44f22a: Pulling fs layer 2025-12-04T09:24:52.7134530Z 52655f8a5bcc: Pulling fs layer 2025-12-04T09:24:52.7134797Z 967a3b1c8fef: Waiting 2025-12-04T09:24:52.7135108Z a64e1a44f22a: Waiting 2025-12-04T09:24:52.7135359Z 52655f8a5bcc: Waiting 2025-12-04T09:24:52.7135625Z 7c40a3faff76: Waiting 2025-12-04T09:24:52.8376331Z eae668646f44: Verifying Checksum 2025-12-04T09:24:52.8376699Z eae668646f44: Download complete 2025-12-04T09:24:52.9022067Z 53c88f1dfeb7: Verifying Checksum 2025-12-04T09:24:52.9022481Z 53c88f1dfeb7: Download complete 2025-12-04T09:24:52.9683199Z ff2e6e687b6c: Verifying Checksum 2025-12-04T09:24:52.9683778Z ff2e6e687b6c: Download complete 2025-12-04T09:24:52.9700230Z 967a3b1c8fef: Verifying Checksum 2025-12-04T09:24:52.9700957Z 967a3b1c8fef: Download complete 2025-12-04T09:24:53.0235029Z 52655f8a5bcc: Download complete 2025-12-04T09:24:53.0886274Z a64e1a44f22a: Verifying Checksum 2025-12-04T09:24:53.0887163Z a64e1a44f22a: Download complete 2025-12-04T09:24:53.6641014Z 7c40a3faff76: Verifying Checksum 2025-12-04T09:24:53.6641854Z 7c40a3faff76: Download complete 2025-12-04T09:24:54.7200980Z 53c88f1dfeb7: Pull complete 2025-12-04T09:24:55.4591990Z eae668646f44: Pull complete 2025-12-04T09:24:58.0088660Z ff2e6e687b6c: Pull complete 2025-12-04T09:25:04.8329217Z 7c40a3faff76: Pull complete 2025-12-04T09:25:05.1198803Z 967a3b1c8fef: Pull complete 2025-12-04T09:25:05.8912599Z a64e1a44f22a: Pull complete 2025-12-04T09:25:05.9143964Z 52655f8a5bcc: Pull complete 2025-12-04T09:25:05.9282525Z Digest: sha256:3f986299a7b8b44b0d8cf9bda2b22361ce5c3058ef5d7cb17fb7452506680ab0 2025-12-04T09:25:05.9325811Z Status: Downloaded newer image for public.ecr.aws/docker/library/python:3.13 2025-12-04T09:25:13.3517806Z Thu Dec 4 09:25:13 2025 2025-12-04T09:25:13.3519010Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:25:13.3519687Z | NVIDIA-SMI 580.82.07 Driver Version: 580.82.07 CUDA Version: 13.0 | 2025-12-04T09:25:13.3520261Z +-----------------------------------------+------------------------+----------------------+ 2025-12-04T09:25:13.3520955Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-12-04T09:25:13.3521836Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-12-04T09:25:13.3522371Z | | | MIG M. | 2025-12-04T09:25:13.3522881Z |=========================================+========================+======================| 2025-12-04T09:25:13.3680312Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2025-12-04T09:25:13.3680885Z | 0% 24C P8 10W / 300W | 0MiB / 23028MiB | 0% Default | 2025-12-04T09:25:13.3681501Z | | | N/A | 2025-12-04T09:25:13.3682009Z +-----------------------------------------+------------------------+----------------------+ 2025-12-04T09:25:13.3686184Z 2025-12-04T09:25:13.3686420Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:25:13.3687098Z | Processes: | 2025-12-04T09:25:13.3687667Z | GPU GI CI PID Type Process name GPU Memory | 2025-12-04T09:25:13.3688237Z | ID ID Usage | 2025-12-04T09:25:13.3688692Z |=========================================================================================| 2025-12-04T09:25:13.3691568Z | No running processes found | 2025-12-04T09:25:13.3692204Z +-----------------------------------------------------------------------------------------+ 2025-12-04T09:25:14.6111364Z Command completed after 1 attempt(s). 2025-12-04T09:25:14.6223533Z Prepare all required actions 2025-12-04T09:25:14.6268835Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-12-04T09:25:14.6269165Z with: 2025-12-04T09:25:14.6269761Z github-token: *** 2025-12-04T09:25:14.6270089Z env: 2025-12-04T09:25:14.6270378Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:14.6270654Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:14.6270972Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:14.6271336Z ##[endgroup] 2025-12-04T09:25:14.6286686Z ##[group]Run set -eux 2025-12-04T09:25:14.6286945Z set -eux 2025-12-04T09:25:14.6287378Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-12-04T09:25:14.6302902Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:14.6303368Z env: 2025-12-04T09:25:14.6303592Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:14.6303862Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:14.6304213Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:14.6304708Z GITHUB_TOKEN: *** 2025-12-04T09:25:14.6304941Z ##[endgroup] 2025-12-04T09:25:14.6346585Z + python3 .github/scripts/get_workflow_job_id.py 19922826259 i-04e943c8c4e7268ee 2025-12-04T09:25:16.2954837Z Setting output job-id=57118183206 2025-12-04T09:25:16.2955663Z Setting output job-name=linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:25:16.3069137Z ##[group]Run python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-12-04T09:25:16.3069852Z python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-12-04T09:25:16.3070764Z python3 -m tools.stats.monitor --log-interval "$MONITOR_LOG_INTERVAL" --data-collect-interval "$MONITOR_DATA_COLLECT_INTERVAL" > usage_log.txt 2>&1 & 2025-12-04T09:25:16.3071580Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2025-12-04T09:25:16.3082960Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:16.3083319Z env: 2025-12-04T09:25:16.3083538Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:16.3083808Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:16.3084120Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:16.3084660Z JOB_ID: 57118183206 2025-12-04T09:25:16.3085339Z JOB_NAME: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:25:16.3086039Z WORKFLOW_NAME: periodic 2025-12-04T09:25:16.3086308Z WORKFLOW_RUN_ID: 19922826259 2025-12-04T09:25:16.3086586Z MONITOR_LOG_INTERVAL: 5 2025-12-04T09:25:16.3086855Z MONITOR_DATA_COLLECT_INTERVAL: 1 2025-12-04T09:25:16.3087162Z ##[endgroup] 2025-12-04T09:25:16.6007448Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T09:25:17.0148007Z Collecting psutil==5.9.8 2025-12-04T09:25:17.0312887Z Downloading psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB) 2025-12-04T09:25:17.1126448Z Collecting dataclasses_json==0.6.7 2025-12-04T09:25:17.1159404Z Downloading dataclasses_json-0.6.7-py3-none-any.whl (28 kB) 2025-12-04T09:25:17.1459282Z Collecting nvidia-ml-py==11.525.84 2025-12-04T09:25:17.1494250Z Downloading nvidia_ml_py-11.525.84-py3-none-any.whl (34 kB) 2025-12-04T09:25:17.1827331Z Collecting typing-inspect<1,>=0.4.0 2025-12-04T09:25:17.1862114Z Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB) 2025-12-04T09:25:17.3090090Z Collecting marshmallow<4.0.0,>=3.18.0 2025-12-04T09:25:17.3123020Z Downloading marshmallow-3.26.1-py3-none-any.whl (50 kB) 2025-12-04T09:25:17.3720686Z Collecting packaging>=17.0 2025-12-04T09:25:17.3753809Z Downloading packaging-25.0-py3-none-any.whl (66 kB) 2025-12-04T09:25:17.4316925Z Collecting typing-extensions>=3.7.4 2025-12-04T09:25:17.4350242Z Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB) 2025-12-04T09:25:17.4554018Z Collecting mypy-extensions>=0.3.0 2025-12-04T09:25:17.4585282Z Downloading mypy_extensions-1.1.0-py3-none-any.whl (5.0 kB) 2025-12-04T09:25:17.5523777Z Installing collected packages: typing-extensions, packaging, mypy-extensions, typing-inspect, marshmallow, psutil, nvidia-ml-py, dataclasses-json 2025-12-04T09:25:17.8294473Z Successfully installed dataclasses-json-0.6.7 marshmallow-3.26.1 mypy-extensions-1.1.0 nvidia-ml-py-11.525.84 packaging-25.0 psutil-5.9.8 typing-extensions-4.15.0 typing-inspect-0.9.0 2025-12-04T09:25:18.0277876Z Prepare all required actions 2025-12-04T09:25:18.0278244Z Getting action download info 2025-12-04T09:25:18.2551311Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-12-04T09:25:18.5696499Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-12-04T09:25:18.9931866Z ##[group]Run ./.github/actions/download-build-artifacts 2025-12-04T09:25:18.9932228Z with: 2025-12-04T09:25:18.9932522Z name: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck 2025-12-04T09:25:18.9932890Z s3-bucket: gha-artifacts 2025-12-04T09:25:18.9933139Z env: 2025-12-04T09:25:18.9933363Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:18.9933661Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:18.9934402Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:18.9934773Z ##[endgroup] 2025-12-04T09:25:18.9966288Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T09:25:18.9966656Z with: 2025-12-04T09:25:18.9966938Z name: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck 2025-12-04T09:25:18.9967313Z s3-bucket: gha-artifacts 2025-12-04T09:25:18.9967583Z region: us-east-1 2025-12-04T09:25:18.9967807Z env: 2025-12-04T09:25:18.9968022Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:18.9968290Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:18.9968611Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:18.9968959Z ##[endgroup] 2025-12-04T09:25:19.4835846Z (node:59324) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T09:25:19.4836322Z 2025-12-04T09:25:19.4836515Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T09:25:19.4837415Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T09:25:19.4837981Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T09:25:19.7682964Z Found 1 objects with prefix pytorch/pytorch/19922826259/linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck/ 2025-12-04T09:25:19.7684514Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T09:25:27.1144851Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T09:25:27.1150712Z Artifact download has finished successfully 2025-12-04T09:25:27.1526164Z ##[group]Run unzip -o artifacts.zip 2025-12-04T09:25:27.1526493Z unzip -o artifacts.zip 2025-12-04T09:25:27.1536792Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:27.1537153Z env: 2025-12-04T09:25:27.1537368Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:27.1537634Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:27.1537970Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:27.1538318Z ##[endgroup] 2025-12-04T09:25:27.1615753Z Archive: artifacts.zip 2025-12-04T09:25:27.1617357Z creating: dist/ 2025-12-04T09:25:29.2246091Z inflating: dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl 2025-12-04T09:25:29.2381565Z inflating: dist/.ninja_log 2025-12-04T09:25:29.2382228Z creating: build/custom_test_artifacts/ 2025-12-04T09:25:29.2382787Z creating: build/custom_test_artifacts/custom-op-build/ 2025-12-04T09:25:29.2383525Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-12-04T09:25:29.2384311Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-12-04T09:25:29.2393187Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T09:25:29.2394072Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/ 2025-12-04T09:25:29.2394948Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T09:25:29.2395914Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T09:25:29.2397029Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T09:25:29.2398710Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T09:25:29.2400466Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T09:25:29.2401472Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T09:25:29.2402448Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T09:25:29.2403315Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T09:25:29.2405802Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T09:25:29.2407081Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T09:25:29.2408402Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T09:25:29.2410275Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T09:25:29.2412740Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T09:25:29.2413823Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/ 2025-12-04T09:25:29.2414587Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/ 2025-12-04T09:25:29.2474415Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-12-04T09:25:29.2535373Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-12-04T09:25:29.2536814Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-12-04T09:25:29.2601498Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-12-04T09:25:29.2602773Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-12-04T09:25:29.2604168Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-12-04T09:25:29.2605573Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-12-04T09:25:29.2606923Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-12-04T09:25:29.2608253Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-12-04T09:25:29.2609541Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-12-04T09:25:29.2610820Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-12-04T09:25:29.2612095Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-12-04T09:25:29.2613301Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-12-04T09:25:29.2614514Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-12-04T09:25:29.2615605Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-12-04T09:25:29.2616548Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-12-04T09:25:29.2617602Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.o 2025-12-04T09:25:29.2619075Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-12-04T09:25:29.2694367Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCUDA/a.out 2025-12-04T09:25:29.2695421Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCUDACompiler.cmake 2025-12-04T09:25:29.2770817Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CUDA.bin 2025-12-04T09:25:29.2771682Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-12-04T09:25:29.2772382Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-12-04T09:25:29.2773156Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-12-04T09:25:29.2773924Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-12-04T09:25:29.2774755Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-12-04T09:25:29.2775695Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-12-04T09:25:29.2776589Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-12-04T09:25:29.2777422Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-12-04T09:25:29.2778210Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-12-04T09:25:29.2779147Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-12-04T09:25:29.2780189Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-12-04T09:25:29.2781011Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-12-04T09:25:29.2781848Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-12-04T09:25:29.2802990Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-12-04T09:25:29.3006409Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-12-04T09:25:29.3007313Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-12-04T09:25:29.3008164Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-12-04T09:25:29.3009121Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-12-04T09:25:29.3010048Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-12-04T09:25:29.3010910Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-12-04T09:25:29.3011705Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-12-04T09:25:29.3012688Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-12-04T09:25:29.3013512Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-12-04T09:25:29.3014449Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-12-04T09:25:29.3015244Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-12-04T09:25:29.3037236Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-12-04T09:25:29.3120213Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-12-04T09:25:29.3121264Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T09:25:29.3135226Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-12-04T09:25:29.3136182Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-12-04T09:25:29.3136828Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-12-04T09:25:29.3137450Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-12-04T09:25:29.3138086Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2025-12-04T09:25:29.3138678Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-12-04T09:25:29.3139231Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-12-04T09:25:29.3139781Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-12-04T09:25:29.3306477Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-12-04T09:25:29.3363879Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-12-04T09:25:29.3364593Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-12-04T09:25:29.3365218Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-12-04T09:25:29.3365875Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-12-04T09:25:29.3373533Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T09:25:29.3374410Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/ 2025-12-04T09:25:29.3375267Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T09:25:29.3376348Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T09:25:29.3377024Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T09:25:29.3378372Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T09:25:29.3380114Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T09:25:29.3381241Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T09:25:29.3382193Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T09:25:29.3383124Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T09:25:29.3385672Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T09:25:29.3387068Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T09:25:29.3388355Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T09:25:29.3390249Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T09:25:29.3392602Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T09:25:29.3393667Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/ 2025-12-04T09:25:29.3394437Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/ 2025-12-04T09:25:29.3454196Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-12-04T09:25:29.3515022Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-12-04T09:25:29.3516266Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-12-04T09:25:29.3581433Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-12-04T09:25:29.3582665Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-12-04T09:25:29.3584080Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-12-04T09:25:29.3585466Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-12-04T09:25:29.3586824Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-12-04T09:25:29.3588055Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-12-04T09:25:29.3589314Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-12-04T09:25:29.3590629Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-12-04T09:25:29.3591906Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-12-04T09:25:29.3593137Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-12-04T09:25:29.3594328Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-12-04T09:25:29.3595496Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-12-04T09:25:29.3596402Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-12-04T09:25:29.3597437Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.o 2025-12-04T09:25:29.3598613Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-12-04T09:25:29.3673352Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCUDA/a.out 2025-12-04T09:25:29.3674425Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCUDACompiler.cmake 2025-12-04T09:25:29.3749876Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CUDA.bin 2025-12-04T09:25:29.3750825Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-12-04T09:25:29.3751621Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-12-04T09:25:29.3752407Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-12-04T09:25:29.3753310Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-12-04T09:25:29.3754329Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-12-04T09:25:29.3755470Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-12-04T09:25:29.3756548Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-12-04T09:25:29.3757480Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-12-04T09:25:29.3758529Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-12-04T09:25:29.3759580Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-12-04T09:25:29.3760582Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-12-04T09:25:29.3761472Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-12-04T09:25:29.3762418Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-12-04T09:25:29.3782439Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-12-04T09:25:29.3847393Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-12-04T09:25:29.3848526Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T09:25:29.3849552Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-12-04T09:25:29.3850483Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-12-04T09:25:29.3851343Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-12-04T09:25:29.3852731Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-12-04T09:25:29.3853592Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2025-12-04T09:25:29.3856617Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-12-04T09:25:29.3857387Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-12-04T09:25:29.3858505Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-12-04T09:25:29.3898645Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-12-04T09:25:29.3900047Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-12-04T09:25:29.3901263Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-12-04T09:25:29.3902471Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-12-04T09:25:29.3907912Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T09:25:29.3908862Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/ 2025-12-04T09:25:29.3909802Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T09:25:29.3910727Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T09:25:29.3911452Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T09:25:29.3912809Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T09:25:29.3914577Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T09:25:29.3915662Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T09:25:29.3916711Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T09:25:29.3917615Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T09:25:29.3919886Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T09:25:29.3921390Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T09:25:29.3922727Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T09:25:29.3924849Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T09:25:29.3927403Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T09:25:29.3928524Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/ 2025-12-04T09:25:29.3929279Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/ 2025-12-04T09:25:29.3990016Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-12-04T09:25:29.4050216Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-12-04T09:25:29.4051513Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-12-04T09:25:29.4116120Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-12-04T09:25:29.4117446Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-12-04T09:25:29.4118893Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-12-04T09:25:29.4120380Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-12-04T09:25:29.4121812Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-12-04T09:25:29.4123209Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-12-04T09:25:29.4124834Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-12-04T09:25:29.4126273Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-12-04T09:25:29.4127635Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-12-04T09:25:29.4129137Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-12-04T09:25:29.4130399Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-12-04T09:25:29.4131436Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-12-04T09:25:29.4132358Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-12-04T09:25:29.4133556Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/tmp/a_dlink.o 2025-12-04T09:25:29.4134496Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-12-04T09:25:29.4209398Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCUDA/a.out 2025-12-04T09:25:29.4211407Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCUDACompiler.cmake 2025-12-04T09:25:29.4284725Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CUDA.bin 2025-12-04T09:25:29.4285822Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-12-04T09:25:29.4286694Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-12-04T09:25:29.4287591Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-12-04T09:25:29.4288562Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-12-04T09:25:29.4289638Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-12-04T09:25:29.4290858Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-12-04T09:25:29.4291979Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-12-04T09:25:29.4293312Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-12-04T09:25:29.4294436Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-12-04T09:25:29.4295607Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-12-04T09:25:29.4296481Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-12-04T09:25:29.4297335Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-12-04T09:25:29.4298168Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-12-04T09:25:29.4299925Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-12-04T09:25:29.4423791Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-12-04T09:25:29.4425179Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-12-04T09:25:29.4426322Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-12-04T09:25:29.4427605Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-12-04T09:25:29.4428809Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-12-04T09:25:29.4429972Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-12-04T09:25:29.4431128Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-12-04T09:25:29.4432373Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-12-04T09:25:29.4433268Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-12-04T09:25:29.4434165Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-12-04T09:25:29.4435038Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-12-04T09:25:29.4453321Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-12-04T09:25:29.4509554Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-12-04T09:25:29.4510822Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T09:25:29.4511948Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-12-04T09:25:29.4512889Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-12-04T09:25:29.4513821Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-12-04T09:25:29.4515735Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-12-04T09:25:29.4516670Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2025-12-04T09:25:29.4519538Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-12-04T09:25:29.4520527Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-12-04T09:25:29.4521372Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-12-04T09:25:29.4625176Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-12-04T09:25:29.4665349Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-12-04T09:25:29.4666001Z creating: build/lib/ 2025-12-04T09:25:29.4749245Z inflating: build/lib/libprotobuf-lite.a 2025-12-04T09:25:29.5193773Z inflating: build/lib/libprotobuf.a 2025-12-04T09:25:29.5691424Z inflating: build/lib/libprotoc.a 2025-12-04T09:25:29.5700511Z inflating: build/lib/libpthreadpool.a 2025-12-04T09:25:29.5709047Z inflating: build/lib/libcpuinfo.a 2025-12-04T09:25:29.5716986Z inflating: build/lib/libcpuinfo_internals.a 2025-12-04T09:25:29.5717755Z inflating: build/lib/libclog.a 2025-12-04T09:25:29.5737747Z inflating: build/lib/libpytorch_qnnpack.a 2025-12-04T09:25:29.5740663Z inflating: build/lib/libnnpack_reference_layers.a 2025-12-04T09:25:29.5758775Z inflating: build/lib/libnnpack.a 2025-12-04T09:25:29.5947041Z inflating: build/lib/libmicrokernels-prod.a 2025-12-04T09:25:29.6828099Z inflating: build/lib/libmicrokernels-all.a 2025-12-04T09:25:29.6898384Z inflating: build/lib/libgtest.a 2025-12-04T09:25:29.6915794Z inflating: build/lib/libgmock.a 2025-12-04T09:25:29.6916439Z inflating: build/lib/libgtest_main.a 2025-12-04T09:25:29.6917567Z inflating: build/lib/libgmock_main.a 2025-12-04T09:25:29.7009009Z inflating: build/lib/libXNNPACK.a 2025-12-04T09:25:29.7085571Z inflating: build/lib/libbenchmark.a 2025-12-04T09:25:29.7086539Z inflating: build/lib/libbenchmark_main.a 2025-12-04T09:25:29.7087477Z inflating: build/lib/libjitprofiling.a 2025-12-04T09:25:29.7153230Z inflating: build/lib/libasmjit.a 2025-12-04T09:25:29.7161304Z inflating: build/lib/libittnotify.a 2025-12-04T09:25:29.8378450Z inflating: build/lib/libfbgemm.a 2025-12-04T09:25:29.8409525Z inflating: build/lib/libtensorpipe_uv.a 2025-12-04T09:25:29.8958822Z inflating: build/lib/libtensorpipe.a 2025-12-04T09:25:29.9205161Z inflating: build/lib/libtensorpipe_cuda.a 2025-12-04T09:25:29.9341251Z inflating: build/lib/libgloo.a 2025-12-04T09:25:29.9388887Z inflating: build/lib/libonnx_proto.a 2025-12-04T09:25:29.9834958Z inflating: build/lib/libgloo_cuda.a 2025-12-04T09:25:30.0554887Z inflating: build/lib/libonnx.a 2025-12-04T09:25:31.0847960Z inflating: build/lib/libdnnl.a 2025-12-04T09:25:31.0867525Z inflating: build/lib/libfmt.a 2025-12-04T09:25:31.1347103Z inflating: build/lib/libkineto.a 2025-12-04T09:25:31.1463932Z inflating: build/lib/libc10.so 2025-12-04T09:25:31.1513702Z inflating: build/lib/libc10_cuda.so 2025-12-04T09:25:31.1515767Z inflating: build/lib/libcaffe2_nvrtc.so 2025-12-04T09:25:31.1517086Z inflating: build/lib/libtorch_global_deps.so 2025-12-04T09:25:34.2768155Z inflating: build/lib/libtorch_cpu.so 2025-12-04T09:25:34.3540370Z inflating: build/lib/libtorch_nvshmem.so 2025-12-04T09:25:36.3384396Z inflating: build/lib/libtorch_cuda.so 2025-12-04T09:25:36.3386427Z inflating: build/lib/libtorch.so 2025-12-04T09:25:36.3439160Z inflating: build/lib/libtorch_cuda_linalg.so 2025-12-04T09:25:36.3511453Z inflating: build/lib/libtorchbind_test.so 2025-12-04T09:25:36.3531088Z inflating: build/lib/libjitbackend_test.so 2025-12-04T09:25:36.3555828Z inflating: build/lib/libbackend_with_compiler.so 2025-12-04T09:25:36.3582483Z inflating: build/lib/libaoti_custom_ops.so 2025-12-04T09:25:36.3585831Z inflating: build/lib/libc10d_cuda_test.so 2025-12-04T09:25:36.3590834Z inflating: build/lib/libshm.so 2025-12-04T09:25:36.5993667Z inflating: build/lib/libtorch_python.so 2025-12-04T09:25:36.6030601Z inflating: build/lib/libnnapi_backend.so 2025-12-04T09:25:36.6031065Z creating: build/bin/ 2025-12-04T09:25:36.6490887Z inflating: build/bin/protoc-3.13.0.0 2025-12-04T09:25:36.6949250Z inflating: build/bin/protoc 2025-12-04T09:25:36.7008832Z inflating: build/bin/c10_AllocatorConfig_test 2025-12-04T09:25:36.7065460Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-12-04T09:25:36.7123571Z inflating: build/bin/c10_DeviceGuard_test 2025-12-04T09:25:36.7183761Z inflating: build/bin/c10_Device_test 2025-12-04T09:25:36.7250612Z inflating: build/bin/c10_DispatchKeySet_test 2025-12-04T09:25:36.7306076Z inflating: build/bin/c10_StreamGuard_test 2025-12-04T09:25:36.7366646Z inflating: build/bin/c10_Scalar_test 2025-12-04T09:25:36.7428698Z inflating: build/bin/c10_SizesAndStrides_test 2025-12-04T09:25:36.7489232Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-12-04T09:25:36.7552216Z inflating: build/bin/c10_SymInt_test 2025-12-04T09:25:36.7614233Z inflating: build/bin/c10_InlineStreamGuard_test 2025-12-04T09:25:36.7669785Z inflating: build/bin/c10_ArrayRef_test 2025-12-04T09:25:36.7746811Z inflating: build/bin/c10_cow_test 2025-12-04T09:25:36.7801933Z inflating: build/bin/c10_ConstexprCrc_test 2025-12-04T09:25:36.7858097Z inflating: build/bin/c10_DeadlockDetection_test 2025-12-04T09:25:36.7916497Z inflating: build/bin/c10_Bitset_test 2025-12-04T09:25:36.7980031Z inflating: build/bin/c10_Enumerate_test 2025-12-04T09:25:36.8039639Z inflating: build/bin/c10_IntrusiveList_test 2025-12-04T09:25:36.8095899Z inflating: build/bin/c10_Half_test 2025-12-04T09:25:36.8158661Z inflating: build/bin/c10_LeftRight_test 2025-12-04T09:25:36.8217740Z inflating: build/bin/c10_NetworkFlow_test 2025-12-04T09:25:36.8273063Z inflating: build/bin/c10_Semaphore_test 2025-12-04T09:25:36.8329094Z inflating: build/bin/c10_Synchronized_test 2025-12-04T09:25:36.8390678Z inflating: build/bin/c10_ThreadLocal_test 2025-12-04T09:25:36.8448712Z inflating: build/bin/c10_TypeIndex_test 2025-12-04T09:25:36.8506272Z inflating: build/bin/c10_accumulate_test 2025-12-04T09:25:36.8568633Z inflating: build/bin/c10_bfloat16_test 2025-12-04T09:25:36.8624721Z inflating: build/bin/c10_bit_cast_test 2025-12-04T09:25:36.8686590Z inflating: build/bin/c10_complex_test 2025-12-04T09:25:36.8749585Z inflating: build/bin/c10_complex_math_test 2025-12-04T09:25:36.8805095Z inflating: build/bin/c10_error_test 2025-12-04T09:25:36.8864422Z inflating: build/bin/c10_exception_test 2025-12-04T09:25:36.8919846Z inflating: build/bin/c10_flags_test 2025-12-04T09:25:36.8976236Z inflating: build/bin/c10_generic_math_test 2025-12-04T09:25:36.9033353Z inflating: build/bin/c10_irange_test 2025-12-04T09:25:36.9092767Z inflating: build/bin/c10_lazy_test 2025-12-04T09:25:36.9262048Z inflating: build/bin/c10_intrusive_ptr_test 2025-12-04T09:25:36.9325124Z inflating: build/bin/c10_logging_test 2025-12-04T09:25:36.9381119Z inflating: build/bin/c10_nofatal_test 2025-12-04T09:25:36.9464023Z inflating: build/bin/c10_optional_test 2025-12-04T09:25:36.9522683Z inflating: build/bin/c10_registry_test 2025-12-04T09:25:36.9591290Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-12-04T09:25:36.9757011Z inflating: build/bin/c10_small_vector_test 2025-12-04T09:25:36.9819219Z inflating: build/bin/c10_string_util_test 2025-12-04T09:25:36.9876689Z inflating: build/bin/c10_ssize_test 2025-12-04T09:25:36.9933050Z inflating: build/bin/c10_tempfile_test 2025-12-04T09:25:36.9987936Z inflating: build/bin/c10_string_view_test 2025-12-04T09:25:37.0051450Z inflating: build/bin/c10_typeid_test 2025-12-04T09:25:37.0099523Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-12-04T09:25:37.0158751Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_thread_and_block_and_device 2025-12-04T09:25:37.0218850Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_stream 2025-12-04T09:25:37.0276608Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_from_2_processes 2025-12-04T09:25:37.0335612Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_1_var_test 2025-12-04T09:25:37.0390637Z inflating: build/bin/c10_cuda_CUDATest 2025-12-04T09:25:37.0450365Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_multiple_blocks 2025-12-04T09:25:37.0509207Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_blocks_and_threads 2025-12-04T09:25:37.0567994Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_same_block 2025-12-04T09:25:37.1183882Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-12-04T09:25:37.1817069Z inflating: build/bin/vec_test_all_types_AVX512 2025-12-04T09:25:37.2459009Z inflating: build/bin/vec_test_all_types_AVX2 2025-12-04T09:25:37.2565362Z inflating: build/bin/test_aoti_abi_check 2025-12-04T09:25:37.2620463Z inflating: build/bin/test_vec_half_DEFAULT 2025-12-04T09:25:37.2676307Z inflating: build/bin/test_vec_half_AVX512 2025-12-04T09:25:37.2732115Z inflating: build/bin/test_vec_half_AVX2 2025-12-04T09:25:37.2812566Z inflating: build/bin/Dict_test 2025-12-04T09:25:37.2871039Z inflating: build/bin/Dimname_test 2025-12-04T09:25:37.2942738Z inflating: build/bin/MaybeOwned_test 2025-12-04T09:25:37.3006390Z inflating: build/bin/NamedTensor_test 2025-12-04T09:25:37.3072076Z inflating: build/bin/apply_utils_test 2025-12-04T09:25:37.3137119Z inflating: build/bin/atest 2025-12-04T09:25:37.3207538Z inflating: build/bin/basic 2025-12-04T09:25:37.3268506Z inflating: build/bin/broadcast_test 2025-12-04T09:25:37.3324775Z inflating: build/bin/cpu_allocator_test 2025-12-04T09:25:37.3389478Z inflating: build/bin/cpu_generator_test 2025-12-04T09:25:37.3448316Z inflating: build/bin/cpu_profiling_allocator_test 2025-12-04T09:25:37.3548740Z inflating: build/bin/cpu_rng_test 2025-12-04T09:25:37.3605342Z inflating: build/bin/dlconvertor_test 2025-12-04T09:25:37.3669502Z inflating: build/bin/extension_backend_test 2025-12-04T09:25:37.3731862Z inflating: build/bin/half_test 2025-12-04T09:25:37.3836682Z inflating: build/bin/ivalue_test 2025-12-04T09:25:37.3892190Z inflating: build/bin/lazy_tensor_test 2025-12-04T09:25:37.3951334Z inflating: build/bin/math_kernel_test 2025-12-04T09:25:37.4010453Z inflating: build/bin/memory_format_test 2025-12-04T09:25:37.4070449Z inflating: build/bin/memory_overlapping_test 2025-12-04T09:25:37.4129924Z inflating: build/bin/mobile_memory_cleanup 2025-12-04T09:25:37.4192166Z inflating: build/bin/native_test 2025-12-04T09:25:37.4248816Z inflating: build/bin/operator_name_test 2025-12-04T09:25:37.4305322Z inflating: build/bin/operators_test 2025-12-04T09:25:37.4363490Z inflating: build/bin/packedtensoraccessor_test 2025-12-04T09:25:37.4438183Z inflating: build/bin/pow_test 2025-12-04T09:25:37.4500987Z inflating: build/bin/quantized_test 2025-12-04T09:25:37.4557133Z inflating: build/bin/reduce_ops_test 2025-12-04T09:25:37.4614427Z inflating: build/bin/reportMemoryUsage_test 2025-12-04T09:25:37.4676178Z inflating: build/bin/scalar_tensor_test 2025-12-04T09:25:37.4739807Z inflating: build/bin/scalar_test 2025-12-04T09:25:37.4797121Z inflating: build/bin/StorageUtils_test 2025-12-04T09:25:37.4855032Z inflating: build/bin/stride_properties_test 2025-12-04T09:25:37.4941607Z inflating: build/bin/tensor_iterator_test 2025-12-04T09:25:37.5002700Z inflating: build/bin/test_parallel 2025-12-04T09:25:37.5059120Z inflating: build/bin/thread_init_test 2025-12-04T09:25:37.5119919Z inflating: build/bin/type_ptr_test 2025-12-04T09:25:37.5185386Z inflating: build/bin/type_test 2025-12-04T09:25:37.5243981Z inflating: build/bin/undefined_tensor_test 2025-12-04T09:25:37.5299341Z inflating: build/bin/verify_api_visibility 2025-12-04T09:25:37.5377035Z inflating: build/bin/legacy_vmap_test 2025-12-04T09:25:37.5434197Z inflating: build/bin/weakref_test 2025-12-04T09:25:37.5491200Z inflating: build/bin/wrapdim_test 2025-12-04T09:25:37.5548621Z inflating: build/bin/xla_tensor_test 2025-12-04T09:25:37.5614323Z inflating: build/bin/IListRef_test 2025-12-04T09:25:37.5728969Z inflating: build/bin/List_test 2025-12-04T09:25:37.5801448Z inflating: build/bin/KernelFunction_test 2025-12-04T09:25:37.5930848Z inflating: build/bin/kernel_function_legacy_test 2025-12-04T09:25:37.6034692Z inflating: build/bin/kernel_function_test 2025-12-04T09:25:37.6170360Z inflating: build/bin/kernel_lambda_legacy_test 2025-12-04T09:25:37.6280410Z inflating: build/bin/kernel_lambda_test 2025-12-04T09:25:37.6346876Z inflating: build/bin/kernel_stackbased_test 2025-12-04T09:25:37.6450458Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-12-04T09:25:37.6506981Z inflating: build/bin/CppSignature_test 2025-12-04T09:25:37.6568161Z inflating: build/bin/backend_fallback_test 2025-12-04T09:25:37.6623377Z inflating: build/bin/op_allowlist_test 2025-12-04T09:25:37.6952168Z inflating: build/bin/op_registration_test 2025-12-04T09:25:37.7025629Z inflating: build/bin/inline_container_test 2025-12-04T09:25:37.7085916Z inflating: build/bin/cuda_allocator_test 2025-12-04T09:25:37.7144600Z inflating: build/bin/cuda_apply_test 2025-12-04T09:25:37.7210628Z inflating: build/bin/cuda_atomic_ops_test 2025-12-04T09:25:37.7273135Z inflating: build/bin/cuda_caching_host_allocator_test 2025-12-04T09:25:37.7350111Z inflating: build/bin/cuda_complex_math_test 2025-12-04T09:25:37.7415690Z inflating: build/bin/cuda_complex_test 2025-12-04T09:25:37.7484986Z inflating: build/bin/cuda_cub_test 2025-12-04T09:25:37.7544514Z inflating: build/bin/cuda_cublas_handle_pool_test 2025-12-04T09:25:37.7599878Z inflating: build/bin/cuda_device_test 2025-12-04T09:25:37.7671735Z inflating: build/bin/cuda_distributions_test 2025-12-04T09:25:37.7731189Z inflating: build/bin/cuda_event_test 2025-12-04T09:25:37.7789162Z inflating: build/bin/cuda_dlconvertor_test 2025-12-04T09:25:37.7844624Z inflating: build/bin/cuda_exchange_device_test 2025-12-04T09:25:37.7903630Z inflating: build/bin/cuda_reportMemoryUsage_test 2025-12-04T09:25:37.7959092Z inflating: build/bin/cuda_allocatorTraceTracker_test 2025-12-04T09:25:37.8016087Z inflating: build/bin/cuda_integer_divider_test 2025-12-04T09:25:37.8084043Z inflating: build/bin/cuda_stream_test 2025-12-04T09:25:37.8140015Z inflating: build/bin/cuda_cudnn_test 2025-12-04T09:25:37.8206726Z inflating: build/bin/cuda_half_test 2025-12-04T09:25:37.8259297Z inflating: build/bin/cuda_generator_test 2025-12-04T09:25:37.8314550Z inflating: build/bin/cuda_optional_test 2025-12-04T09:25:37.8372595Z inflating: build/bin/cuda_packedtensoraccessor_test 2025-12-04T09:25:37.8431361Z inflating: build/bin/cuda_vectorized_test 2025-12-04T09:25:37.9567006Z inflating: build/bin/test_jit 2025-12-04T09:25:37.9626499Z inflating: build/bin/BackoffTest 2025-12-04T09:25:37.9684954Z inflating: build/bin/FileStoreTest 2025-12-04T09:25:38.0052532Z inflating: build/bin/test_lazy 2025-12-04T09:25:38.0114976Z inflating: build/bin/TCPStoreTest 2025-12-04T09:25:38.0175059Z inflating: build/bin/HashStoreTest 2025-12-04T09:25:38.0189498Z inflating: build/bin/ProcessGroupMPITest 2025-12-04T09:25:38.0192622Z inflating: build/bin/example_allreduce 2025-12-04T09:25:38.0267120Z inflating: build/bin/ProcessGroupGlooTest 2025-12-04T09:25:38.0329261Z inflating: build/bin/ProcessGroupGlooAsyncTest 2025-12-04T09:25:38.0399782Z inflating: build/bin/ProcessGroupNCCLTest 2025-12-04T09:25:38.0468321Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2025-12-04T09:25:38.0529250Z inflating: build/bin/test_dist_autograd 2025-12-04T09:25:38.0604520Z inflating: build/bin/test_cpp_rpc 2025-12-04T09:25:38.0607174Z inflating: build/bin/parallel_benchmark 2025-12-04T09:25:38.1825875Z inflating: build/bin/test_api 2025-12-04T09:25:38.1829971Z inflating: build/bin/torch_shm_manager 2025-12-04T09:25:38.1830323Z creating: .additional_ci_files/ 2025-12-04T09:25:38.1895524Z inflating: .additional_ci_files/test-times.json 2025-12-04T09:25:38.2133554Z inflating: .additional_ci_files/test-class-times.json 2025-12-04T09:25:38.2183658Z ##[group]Run rm artifacts.zip 2025-12-04T09:25:38.2183973Z rm artifacts.zip 2025-12-04T09:25:38.2195942Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:38.2196306Z env: 2025-12-04T09:25:38.2196729Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:38.2197008Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:38.2197328Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:38.2197687Z ##[endgroup] 2025-12-04T09:25:38.5118070Z ##[group]Run df -H 2025-12-04T09:25:38.5118331Z df -H 2025-12-04T09:25:38.5127851Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:38.5128231Z env: 2025-12-04T09:25:38.5128516Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:38.5128787Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:38.5129109Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:38.5129457Z ##[endgroup] 2025-12-04T09:25:38.5181659Z Filesystem Size Used Avail Use% Mounted on 2025-12-04T09:25:38.5182159Z devtmpfs 4.2M 0 4.2M 0% /dev 2025-12-04T09:25:38.5182555Z tmpfs 34G 0 34G 0% /dev/shm 2025-12-04T09:25:38.5183005Z tmpfs 14G 562k 14G 1% /run 2025-12-04T09:25:38.5183332Z /dev/nvme0n1p1 161G 54G 108G 34% / 2025-12-04T09:25:38.5183657Z tmpfs 34G 17k 34G 1% /tmp 2025-12-04T09:25:38.5183995Z /dev/nvme0n1p128 11M 1.4M 9.2M 13% /boot/efi 2025-12-04T09:25:38.5184345Z tmpfs 6.7G 0 6.7G 0% /run/user/0 2025-12-04T09:25:38.5228196Z Prepare all required actions 2025-12-04T09:25:38.5229562Z Getting action download info 2025-12-04T09:25:38.6991639Z ##[group]Run ./.github/actions/download-td-artifacts 2025-12-04T09:25:38.6991983Z with: 2025-12-04T09:25:38.6992184Z env: 2025-12-04T09:25:38.6992398Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:38.6992670Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:38.6992991Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:38.6993342Z ##[endgroup] 2025-12-04T09:25:38.7077508Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T09:25:38.7077843Z with: 2025-12-04T09:25:38.7078044Z name: td_results 2025-12-04T09:25:38.7078287Z s3-bucket: gha-artifacts 2025-12-04T09:25:38.7078580Z region: us-east-1 2025-12-04T09:25:38.7078821Z env: 2025-12-04T09:25:38.7079041Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:38.7079305Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:38.7079623Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:38.7080181Z ##[endgroup] 2025-12-04T09:25:39.4052056Z (node:59347) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T09:25:39.4052529Z 2025-12-04T09:25:39.4052717Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T09:25:39.4053234Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T09:25:39.4053769Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T09:25:39.5132348Z Found 1 objects with prefix pytorch/pytorch/19922826259/td_results/ 2025-12-04T09:25:39.5132993Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2025-12-04T09:25:39.5628300Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2025-12-04T09:25:39.5633920Z Artifact download has finished successfully 2025-12-04T09:25:39.5974700Z ##[group]Run mkdir -p .additional_ci_files 2025-12-04T09:25:39.5975107Z mkdir -p .additional_ci_files 2025-12-04T09:25:39.5975547Z mv td_results.json .additional_ci_files/td_results.json || true 2025-12-04T09:25:39.5984919Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:39.5985289Z env: 2025-12-04T09:25:39.5985509Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:39.5985783Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:39.5986102Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:39.5986452Z ##[endgroup] 2025-12-04T09:25:39.6152201Z ##[group]Run .github/scripts/parse_ref.py 2025-12-04T09:25:39.6152604Z .github/scripts/parse_ref.py 2025-12-04T09:25:39.6161524Z shell: /usr/bin/bash -e {0} 2025-12-04T09:25:39.6161789Z env: 2025-12-04T09:25:39.6162004Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:39.6162265Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:39.6162588Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:39.6162943Z ##[endgroup] 2025-12-04T09:25:39.6402522Z Setting output branch=main 2025-12-04T09:25:39.6527824Z Prepare all required actions 2025-12-04T09:25:39.6528190Z Getting action download info 2025-12-04T09:25:39.8084945Z ##[group]Run ./.github/actions/filter-test-configs 2025-12-04T09:25:39.8085284Z with: 2025-12-04T09:25:39.8085666Z github-token: *** 2025-12-04T09:25:39.8093816Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}]} 2025-12-04T09:25:39.8102740Z job-name: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:25:39.8103491Z env: 2025-12-04T09:25:39.8103708Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:39.8103977Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:39.8104291Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:39.8104648Z ##[endgroup] 2025-12-04T09:25:39.8152665Z ##[group]Run nick-fields/retry@v3.0.0 2025-12-04T09:25:39.8152970Z with: 2025-12-04T09:25:39.8153169Z shell: bash 2025-12-04T09:25:39.8153395Z timeout_minutes: 10 2025-12-04T09:25:39.8153644Z max_attempts: 5 2025-12-04T09:25:39.8153880Z retry_wait_seconds: 30 2025-12-04T09:25:39.8154640Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T09:25:39.8155612Z polling_interval_seconds: 1 2025-12-04T09:25:39.8155907Z warning_on_retry: true 2025-12-04T09:25:39.8156168Z continue_on_error: false 2025-12-04T09:25:39.8156423Z env: 2025-12-04T09:25:39.8156641Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:39.8156901Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:39.8157220Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:39.8157714Z GITHUB_TOKEN: *** 2025-12-04T09:25:39.8157943Z ##[endgroup] 2025-12-04T09:25:39.9185356Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T09:25:40.1522285Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T09:25:40.2765704Z Collecting requests==2.27.1 2025-12-04T09:25:40.2932319Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-12-04T09:25:40.4911338Z Collecting pyyaml==6.0.2 2025-12-04T09:25:40.4950739Z Downloading PyYAML-6.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (737 kB) 2025-12-04T09:25:40.5254968Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (1.25.10) 2025-12-04T09:25:40.9724708Z Collecting charset-normalizer~=2.0.0 2025-12-04T09:25:40.9766266Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-12-04T09:25:40.9820976Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (2.10) 2025-12-04T09:25:41.0339608Z Collecting certifi>=2017.4.17 2025-12-04T09:25:41.0379201Z Downloading certifi-2025.11.12-py3-none-any.whl (159 kB) 2025-12-04T09:25:41.1313152Z Installing collected packages: charset-normalizer, certifi, requests, pyyaml 2025-12-04T09:25:41.2529573Z Successfully installed certifi-2025.11.12 charset-normalizer-2.0.12 pyyaml-6.0.2 requests-2.27.1 2025-12-04T09:25:41.8949078Z Command completed after 1 attempt(s). 2025-12-04T09:25:41.9016994Z ##[group]Run set -x 2025-12-04T09:25:41.9017256Z set -x 2025-12-04T09:25:41.9017481Z  2025-12-04T09:25:41.9017879Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T09:25:41.9018348Z # in runner workspace 2025-12-04T09:25:41.9018737Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-12-04T09:25:41.9028605Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:41.9028982Z env: 2025-12-04T09:25:41.9029203Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:41.9029474Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:41.9029806Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:41.9030179Z ##[endgroup] 2025-12-04T09:25:41.9061749Z + python3 /home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-12-04T09:25:41.9248729Z Setting output branch=main 2025-12-04T09:25:41.9304530Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T09:25:41.9305013Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T09:25:41.9305479Z echo "Job name: ${JOB_NAME}" 2025-12-04T09:25:41.9305841Z  2025-12-04T09:25:41.9306220Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T09:25:41.9306688Z # in runner workspace 2025-12-04T09:25:41.9307113Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-12-04T09:25:41.9307588Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-12-04T09:25:41.9307914Z  --job-name "${JOB_NAME}" \ 2025-12-04T09:25:41.9316289Z  --test-matrix "{"include": [{"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}]}" \ 2025-12-04T09:25:41.9325437Z  --selected-test-configs "" \ 2025-12-04T09:25:41.9325771Z  --pr-number "${PR_NUMBER}" \ 2025-12-04T09:25:41.9326079Z  --tag "${TAG}" \ 2025-12-04T09:25:41.9326363Z  --event-name "${EVENT_NAME}" \ 2025-12-04T09:25:41.9326682Z  --schedule "${SCHEDULE}" \ 2025-12-04T09:25:41.9326996Z  --branch "${HEAD_BRANCH}" 2025-12-04T09:25:41.9336232Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:41.9336599Z env: 2025-12-04T09:25:41.9336816Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:41.9337079Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:41.9337389Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:41.9337956Z GITHUB_TOKEN: *** 2025-12-04T09:25:41.9338621Z JOB_NAME: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:25:41.9339310Z PR_NUMBER: 2025-12-04T09:25:41.9339516Z TAG: 2025-12-04T09:25:41.9339723Z EVENT_NAME: schedule 2025-12-04T09:25:41.9339967Z SCHEDULE: 29 8 * * * 2025-12-04T09:25:41.9340199Z HEAD_BRANCH: main 2025-12-04T09:25:41.9340429Z ##[endgroup] 2025-12-04T09:25:41.9372129Z Workflow: periodic 2025-12-04T09:25:41.9373028Z Job name: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:25:42.1334665Z Setting output keep-going=True 2025-12-04T09:25:42.1335127Z Setting output ci-verbose-test-logs=False 2025-12-04T09:25:42.1335488Z Setting output ci-test-showlocals=False 2025-12-04T09:25:42.1335823Z Setting output ci-no-test-timeout=False 2025-12-04T09:25:42.1336147Z Setting output ci-no-td=False 2025-12-04T09:25:42.1336703Z Setting output ci-td-distributed=False 2025-12-04T09:25:42.1337035Z Setting output is-unstable=False 2025-12-04T09:25:42.1337346Z Setting output reenabled-issues= 2025-12-04T09:25:42.1355811Z Setting output test-matrix={"include": [{"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}]} 2025-12-04T09:25:42.1374290Z Setting output is-test-matrix-empty=False 2025-12-04T09:25:42.1474923Z ##[group]Run echo "Filtered matrix:" 2025-12-04T09:25:42.1475301Z echo "Filtered matrix:" 2025-12-04T09:25:42.1496450Z echo "{"include": [{"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 7, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 8, "num_shards": 8, "runner": "linux.g5.4xlarge.nvidia.gpu", "owners": ["module:slowgradcheck"], "rerun_disabled_tests": "rerun_disabled_tests"}]}" 2025-12-04T09:25:42.1514788Z  2025-12-04T09:25:42.1515004Z echo 2025-12-04T09:25:42.1515297Z echo "Is the current job unstable? False" 2025-12-04T09:25:42.1515626Z  2025-12-04T09:25:42.1515830Z echo 2025-12-04T09:25:42.1516089Z echo "Is keep-going label set? True" 2025-12-04T09:25:42.1516400Z  2025-12-04T09:25:42.1516607Z echo 2025-12-04T09:25:42.1516849Z echo "Reenabled issues? " 2025-12-04T09:25:42.1526193Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:42.1526557Z env: 2025-12-04T09:25:42.1526779Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:42.1527041Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:42.1527362Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:42.1527718Z ##[endgroup] 2025-12-04T09:25:42.1557056Z Filtered matrix: 2025-12-04T09:25:42.1579606Z {include: [{config: default, shard: 1, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check}, {config: default, shard: 1, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 1, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: default, shard: 1, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 2, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check}, {config: default, shard: 2, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 2, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: default, shard: 2, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 3, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check}, {config: default, shard: 3, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 3, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: default, shard: 3, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 4, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check}, {config: default, shard: 4, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 4, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: default, shard: 4, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 5, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check}, {config: default, shard: 5, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 5, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: default, shard: 5, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 6, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check}, {config: default, shard: 6, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 6, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: default, shard: 6, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 7, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check}, {config: default, shard: 7, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 7, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: default, shard: 7, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 8, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check}, {config: default, shard: 8, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 8, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: default, shard: 8, num_shards: 8, runner: linux.g5.4xlarge.nvidia.gpu, owners: [module:slowgradcheck], rerun_disabled_tests: rerun_disabled_tests}]} 2025-12-04T09:25:42.1597649Z 2025-12-04T09:25:42.1597764Z Is the current job unstable? False 2025-12-04T09:25:42.1597969Z 2025-12-04T09:25:42.1598085Z Is keep-going label set? True 2025-12-04T09:25:42.1598276Z 2025-12-04T09:25:42.1598381Z Reenabled issues? 2025-12-04T09:25:42.1629971Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T09:25:42.1630539Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T09:25:42.1638882Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:42.1639237Z env: 2025-12-04T09:25:42.1639457Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:42.1639723Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:42.1640041Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:42.1640393Z JOB_TIMEOUT: 600 2025-12-04T09:25:42.1640622Z ##[endgroup] 2025-12-04T09:25:42.1695969Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:25:42.1696482Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:25:42.1696925Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:25:42.1705265Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:25:42.1705631Z env: 2025-12-04T09:25:42.1705856Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:42.1706119Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:42.1706440Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:42.1706789Z ##[endgroup] 2025-12-04T09:25:42.1822484Z ##[group]Run set -x 2025-12-04T09:25:42.1822816Z set -x 2025-12-04T09:25:42.1823132Z  2025-12-04T09:25:42.1823391Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-12-04T09:25:42.1823780Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-12-04T09:25:42.1824177Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-12-04T09:25:42.1824748Z  TEST_COMMAND=.ci/onnx/test.sh 2025-12-04T09:25:42.1825041Z else 2025-12-04T09:25:42.1825302Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T09:25:42.1825606Z fi 2025-12-04T09:25:42.1825806Z  2025-12-04T09:25:42.1826075Z # Leaving 1GB for the runner and other things 2025-12-04T09:25:42.1826636Z TOTAL_AVAILABLE_MEMORY_IN_GB=$(awk '/MemTotal/ { printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo) 2025-12-04T09:25:42.1827463Z # https://docs.docker.com/engine/containers/resource_constraints/#--memory-swap-details, the 3GB swap 2025-12-04T09:25:42.1828150Z # comes from https://github.com/pytorch/test-infra/pull/6058 2025-12-04T09:25:42.1828677Z TOTAL_MEMORY_WITH_SWAP=$(("${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}" + 3)) 2025-12-04T09:25:42.1829082Z  2025-12-04T09:25:42.1829345Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-12-04T09:25:42.1829686Z  SHM_OPTS= 2025-12-04T09:25:42.1830114Z  JENKINS_USER= 2025-12-04T09:25:42.1830464Z  # ensure that docker container cleanly exits in 12 hours 2025-12-04T09:25:42.1830945Z  # if for some reason cleanup action doesn't stop container 2025-12-04T09:25:42.1831340Z  # when job is cancelled 2025-12-04T09:25:42.1831648Z  DOCKER_SHELL_CMD="sleep 12h" 2025-12-04T09:25:42.1831973Z  USED_IMAGE="${DOCKER_IMAGE_S390X}" 2025-12-04T09:25:42.1832283Z else 2025-12-04T09:25:42.1832539Z  SHM_OPTS="--shm-size=${SHM_SIZE}" 2025-12-04T09:25:42.1832873Z  JENKINS_USER="--user jenkins" 2025-12-04T09:25:42.1833186Z  DOCKER_SHELL_CMD= 2025-12-04T09:25:42.1833478Z  USED_IMAGE="${DOCKER_IMAGE}" 2025-12-04T09:25:42.1833762Z fi 2025-12-04T09:25:42.1833973Z  2025-12-04T09:25:42.1834310Z # detached container should get cleaned up by teardown_ec2_linux 2025-12-04T09:25:42.1834846Z # TODO: Stop building test binaries as part of the build phase 2025-12-04T09:25:42.1835442Z # Used for GPU_FLAG, SHM_OPTS, JENKINS_USER and DOCKER_SHELL_CMD since that doesn't play nice 2025-12-04T09:25:42.1835973Z # shellcheck disable=SC2086,SC2090 2025-12-04T09:25:42.1836316Z container_name=$(docker run \ 2025-12-04T09:25:42.1836622Z  ${GPU_FLAG:-} \ 2025-12-04T09:25:42.1836930Z  ${SCCACHE_SERVER_PORT_DOCKER_FLAG:-} \ 2025-12-04T09:25:42.1837279Z  -e BUILD_ENVIRONMENT \ 2025-12-04T09:25:42.1837574Z  -e PR_NUMBER \ 2025-12-04T09:25:42.1837846Z  -e GITHUB_ACTIONS \ 2025-12-04T09:25:42.1838134Z  -e GITHUB_REPOSITORY \ 2025-12-04T09:25:42.1838433Z  -e GITHUB_WORKFLOW \ 2025-12-04T09:25:42.1838711Z  -e GITHUB_JOB \ 2025-12-04T09:25:42.1838978Z  -e GITHUB_RUN_ID \ 2025-12-04T09:25:42.1839257Z  -e GITHUB_RUN_NUMBER \ 2025-12-04T09:25:42.1839546Z  -e GITHUB_RUN_ATTEMPT \ 2025-12-04T09:25:42.1839841Z  -e JOB_ID \ 2025-12-04T09:25:42.1840099Z  -e JOB_NAME \ 2025-12-04T09:25:42.1840358Z  -e BASE_SHA \ 2025-12-04T09:25:42.1840605Z  -e BRANCH \ 2025-12-04T09:25:42.1840849Z  -e SHA1 \ 2025-12-04T09:25:42.1841103Z  -e AWS_DEFAULT_REGION \ 2025-12-04T09:25:42.1841389Z  -e IN_WHEEL_TEST \ 2025-12-04T09:25:42.1841665Z  -e SHARD_NUMBER \ 2025-12-04T09:25:42.1841939Z  -e TEST_CONFIG \ 2025-12-04T09:25:42.1842203Z  -e NUM_TEST_SHARDS \ 2025-12-04T09:25:42.1842658Z  -e REENABLED_ISSUES \ 2025-12-04T09:25:42.1842969Z  -e CONTINUE_THROUGH_ERROR \ 2025-12-04T09:25:42.1843277Z  -e VERBOSE_TEST_LOGS \ 2025-12-04T09:25:42.1843581Z  -e TEST_SHOWLOCALS \ 2025-12-04T09:25:42.1843867Z  -e NO_TEST_TIMEOUT \ 2025-12-04T09:25:42.1844143Z  -e NO_TD \ 2025-12-04T09:25:42.1844397Z  -e TD_DISTRIBUTED \ 2025-12-04T09:25:42.1844688Z  -e PR_LABELS \ 2025-12-04T09:25:42.1844992Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-12-04T09:25:42.1845316Z  -e SCCACHE_BUCKET \ 2025-12-04T09:25:42.1845599Z  -e SCCACHE_REGION \ 2025-12-04T09:25:42.1845873Z  -e XLA_CUDA \ 2025-12-04T09:25:42.1846159Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2025-12-04T09:25:42.1846521Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-12-04T09:25:42.1846899Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-12-04T09:25:42.1847278Z  -e SKIP_SCCACHE_INITIALIZATION=1 \ 2025-12-04T09:25:42.1847617Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-12-04T09:25:42.1847951Z  -e VLLM_TEST_HUGGING_FACE_TOKEN \ 2025-12-04T09:25:42.1848300Z  -e SCRIBE_GRAPHQL_ACCESS_TOKEN \ 2025-12-04T09:25:42.1848620Z  -e DASHBOARD_TAG \ 2025-12-04T09:25:42.1848914Z  -e ARTIFACTS_FILE_SUFFIX \ 2025-12-04T09:25:42.1849373Z  --memory="${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}g" \ 2025-12-04T09:25:42.1849793Z  --memory-swap="${TOTAL_MEMORY_WITH_SWAP}g" \ 2025-12-04T09:25:42.1850206Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T09:25:42.1850610Z  --security-opt seccomp=unconfined \ 2025-12-04T09:25:42.1850950Z  --cap-add=SYS_PTRACE \ 2025-12-04T09:25:42.1851235Z  --ipc=host \ 2025-12-04T09:25:42.1851496Z  ${SHM_OPTS} \ 2025-12-04T09:25:42.1851746Z  --tty \ 2025-12-04T09:25:42.1851977Z  --detach \ 2025-12-04T09:25:42.1852252Z  --name="${container_name}" \ 2025-12-04T09:25:42.1852564Z  ${JENKINS_USER} \ 2025-12-04T09:25:42.1852904Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-12-04T09:25:42.1853306Z  -w /var/lib/jenkins/workspace \ 2025-12-04T09:25:42.1853618Z  "${USED_IMAGE}" \ 2025-12-04T09:25:42.1853892Z  ${DOCKER_SHELL_CMD} 2025-12-04T09:25:42.1863470Z ) 2025-12-04T09:25:42.1863846Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2025-12-04T09:25:42.1864259Z  2025-12-04T09:25:42.1864536Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-12-04T09:25:42.1865111Z  docker exec -t "${container_name}" sh -c "python3 -m pip install -r .ci/docker/requirements-ci.txt" 2025-12-04T09:25:42.1865627Z fi 2025-12-04T09:25:42.1865841Z  2025-12-04T09:25:42.1866335Z docker exec -t "${container_name}" sh -c "python3 -m pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2025-12-04T09:25:42.1875558Z shell: /usr/bin/bash -e {0} 2025-12-04T09:25:42.1875827Z env: 2025-12-04T09:25:42.1876049Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:25:42.1876309Z HAS_NVIDIA_GPU: true 2025-12-04T09:25:42.1876631Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:25:42.1877113Z BUILD_ENVIRONMENT: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck 2025-12-04T09:25:42.1877521Z PR_NUMBER: 2025-12-04T09:25:42.1877776Z GITHUB_REPOSITORY: pytorch/pytorch 2025-12-04T09:25:42.1878092Z GITHUB_WORKFLOW: periodic 2025-12-04T09:25:42.1878353Z GITHUB_JOB: test 2025-12-04T09:25:42.1878596Z GITHUB_RUN_ID: 19922826259 2025-12-04T09:25:42.1879070Z GITHUB_RUN_NUMBER: 19107 2025-12-04T09:25:42.1879340Z GITHUB_RUN_ATTEMPT: 1 2025-12-04T09:25:42.1879582Z JOB_ID: 57118183206 2025-12-04T09:25:42.1880239Z JOB_NAME: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:25:42.1881059Z BRANCH: main 2025-12-04T09:25:42.1881328Z SHA1: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:25:42.1881724Z BASE_SHA: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:25:42.1882070Z TEST_CONFIG: default 2025-12-04T09:25:42.1882309Z SHARD_NUMBER: 4 2025-12-04T09:25:42.1882542Z NUM_TEST_SHARDS: 8 2025-12-04T09:25:42.1882779Z EXTRA_FLAGS: 2025-12-04T09:25:42.1883010Z OP_BENCHMARK_TESTS: 2025-12-04T09:25:42.1883255Z REENABLED_ISSUES: 2025-12-04T09:25:42.1883514Z CONTINUE_THROUGH_ERROR: True 2025-12-04T09:25:42.1883804Z VERBOSE_TEST_LOGS: False 2025-12-04T09:25:42.1884069Z TEST_SHOWLOCALS: False 2025-12-04T09:25:42.1884337Z NO_TEST_TIMEOUT: False 2025-12-04T09:25:42.1884583Z NO_TD: False 2025-12-04T09:25:42.1884808Z TD_DISTRIBUTED: False 2025-12-04T09:25:42.1885128Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2025-12-04T09:25:42.1885493Z SCCACHE_REGION: us-east-1 2025-12-04T09:25:42.1885751Z SHM_SIZE: 2g 2025-12-04T09:25:42.1886653Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:25:42.1888326Z DOCKER_IMAGE_S390X: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:25:42.1889275Z XLA_CUDA: 2025-12-04T09:25:42.1889645Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2025-12-04T09:25:42.1890113Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 1 2025-12-04T09:25:42.1890449Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-12-04T09:25:42.1890759Z DASHBOARD_TAG: 2025-12-04T09:25:42.1891208Z VLLM_TEST_HUGGING_FACE_TOKEN: *** 2025-12-04T09:25:42.1891627Z HUGGING_FACE_HUB_TOKEN: *** 2025-12-04T09:25:42.1892046Z SCRIBE_GRAPHQL_ACCESS_TOKEN: *** 2025-12-04T09:25:42.1892527Z ARTIFACTS_FILE_SUFFIX: test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206 2025-12-04T09:25:42.1892996Z ##[endgroup] 2025-12-04T09:25:42.1922506Z + [[ default == \m\u\l\t\i\g\p\u ]] 2025-12-04T09:25:42.1922933Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *onnx* ]] 2025-12-04T09:25:42.1923346Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T09:25:42.1927467Z ++ awk '/MemTotal/ { printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo 2025-12-04T09:25:42.1953819Z + TOTAL_AVAILABLE_MEMORY_IN_GB='61.094 ' 2025-12-04T09:25:42.1954213Z + TOTAL_MEMORY_WITH_SWAP=64 2025-12-04T09:25:42.1954671Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *\s\3\9\0\x* ]] 2025-12-04T09:25:42.1955094Z + SHM_OPTS=--shm-size=2g 2025-12-04T09:25:42.1955372Z + JENKINS_USER='--user jenkins' 2025-12-04T09:25:42.1955642Z + DOCKER_SHELL_CMD= 2025-12-04T09:25:42.1956413Z + USED_IMAGE=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:25:42.1964259Z +++ nproc --ignore=2 2025-12-04T09:25:42.1993408Z ++ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e TD_DISTRIBUTED -e PR_LABELS -e MAX_JOBS=14 -e SCCACHE_BUCKET -e SCCACHE_REGION -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e SKIP_SCCACHE_INITIALIZATION=1 -e HUGGING_FACE_HUB_TOKEN -e VLLM_TEST_HUGGING_FACE_TOKEN -e SCRIBE_GRAPHQL_ACCESS_TOKEN -e DASHBOARD_TAG -e ARTIFACTS_FILE_SUFFIX --memory=61g --memory-swap=64g --env-file=/tmp/github_env_19922826259 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:25:50.7170527Z + container_name=9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T09:25:50.7171487Z + echo DOCKER_CONTAINER_ID=9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T09:25:50.7172203Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *\s\3\9\0\x* ]] 2025-12-04T09:25:50.7176586Z ++ echo dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl 2025-12-04T09:25:50.7179656Z + docker exec -t 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e sh -c 'python3 -m pip install dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl[opt-einsum] && .ci/pytorch/test.sh' 2025-12-04T09:25:51.1750134Z Processing ./dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl (from torch==2.10.0a0+gitffd9b0f) 2025-12-04T09:25:51.4980407Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (3.18.0) 2025-12-04T09:25:51.4983547Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (4.12.2) 2025-12-04T09:25:51.4987477Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (1.13.3) 2025-12-04T09:25:51.4992178Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (2.8.8) 2025-12-04T09:25:51.4996150Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (3.1.6) 2025-12-04T09:25:51.5001344Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (2025.10.0) 2025-12-04T09:25:51.5016364Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (3.3.0) 2025-12-04T09:25:51.5417772Z Requirement already satisfied: numpy>=1.7 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (1.22.4) 2025-12-04T09:25:51.5437267Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (1.3.0) 2025-12-04T09:25:51.5502090Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.10.0a0+gitffd9b0f->torch==2.10.0a0+gitffd9b0f) (3.0.3) 2025-12-04T09:25:51.9362113Z Installing collected packages: torch 2025-12-04T09:26:03.4786616Z Successfully installed torch-2.10.0a0+gitffd9b0f 2025-12-04T09:26:03.5532487Z + export TERM=vt100 2025-12-04T09:26:03.5532731Z + TERM=vt100 2025-12-04T09:26:03.5535342Z ++ dirname .ci/pytorch/test.sh 2025-12-04T09:26:03.5545540Z + source .ci/pytorch/common.sh 2025-12-04T09:26:03.5550165Z +++ dirname .ci/pytorch/common.sh 2025-12-04T09:26:03.5560723Z ++ source .ci/pytorch/common_utils.sh 2025-12-04T09:26:03.5562382Z +++ declare -f -t trap_add 2025-12-04T09:26:03.5567328Z ++ set -ex -o pipefail 2025-12-04T09:26:03.5568011Z ++ [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *rocm* ]] 2025-12-04T09:26:03.5568470Z ++ BUILD_TEST_LIBTORCH=0 2025-12-04T09:26:03.5571438Z ++ dirname .ci/pytorch/test.sh 2025-12-04T09:26:03.5580559Z + source .ci/pytorch/common-build.sh 2025-12-04T09:26:03.5581325Z ++ [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck != *win-* ]] 2025-12-04T09:26:03.5588865Z ++++ dirname .ci/pytorch/common-build.sh 2025-12-04T09:26:03.5598597Z +++ cd .ci/pytorch 2025-12-04T09:26:03.5598849Z +++ pwd -P 2025-12-04T09:26:03.5601925Z ++ script_dir=/var/lib/jenkins/workspace/.ci/pytorch 2025-12-04T09:26:03.5602439Z ++ [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *-pch* ]] 2025-12-04T09:26:03.5602844Z ++ which sccache 2025-12-04T09:26:03.5674294Z ++ [[ -z ossci-compiler-cache-circleci-v2 ]] 2025-12-04T09:26:03.5674674Z ++ sccache --stop-server 2025-12-04T09:26:03.5704816Z ++ true 2025-12-04T09:26:03.5705087Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-12-04T09:26:03.5715753Z ++ trap_add sccache_epilogue EXIT 2025-12-04T09:26:03.5716085Z ++ trap_add_cmd=sccache_epilogue 2025-12-04T09:26:03.5716362Z ++ shift 2025-12-04T09:26:03.5716583Z ++ for trap_add_name in "$@" 2025-12-04T09:26:03.5723153Z ++++ trap -p EXIT 2025-12-04T09:26:03.5726774Z +++ eval 'extract_trap_cmd ' 2025-12-04T09:26:03.5727140Z ++++ extract_trap_cmd 2025-12-04T09:26:03.5727408Z ++++ printf '%s\n' '' 2025-12-04T09:26:03.5728091Z +++ printf '%s\n' sccache_epilogue 2025-12-04T09:26:03.5731571Z ++ trap -- ' 2025-12-04T09:26:03.5731868Z sccache_epilogue' EXIT 2025-12-04T09:26:03.5732165Z ++ [[ -n 1 ]] 2025-12-04T09:26:03.5732634Z ++ echo 'Skipping sccache server initialization, setting environment variables' 2025-12-04T09:26:03.5733421Z Skipping sccache server initialization, setting environment variables 2025-12-04T09:26:03.5734260Z ++ export SCCACHE_IDLE_TIMEOUT=0 2025-12-04T09:26:03.5734659Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-12-04T09:26:03.5735017Z ++ export SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T09:26:03.5735463Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T09:26:03.5740395Z ++ export RUST_LOG=sccache::server=error 2025-12-04T09:26:03.5740781Z ++ RUST_LOG=sccache::server=error 2025-12-04T09:26:03.5741079Z ++ sccache --zero-stats 2025-12-04T09:26:03.8923894Z Statistics zeroed. 2025-12-04T09:26:03.8933359Z ++ which ccache 2025-12-04T09:26:03.9033860Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck != *rocm* ]] 2025-12-04T09:26:03.9034503Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck != *s390x* ]] 2025-12-04T09:26:03.9034951Z + [[ -d /var/lib/jenkins/workspace ]] 2025-12-04T09:26:03.9036629Z ++ stat -c %u /var/lib/jenkins/workspace 2025-12-04T09:26:03.9054093Z + WORKSPACE_ORIGINAL_OWNER_ID=1000 2025-12-04T09:26:03.9054547Z + trap_add cleanup_workspace EXIT 2025-12-04T09:26:03.9055033Z + trap_add_cmd=cleanup_workspace 2025-12-04T09:26:03.9055305Z + shift 2025-12-04T09:26:03.9055523Z + for trap_add_name in "$@" 2025-12-04T09:26:03.9063623Z +++ trap -p EXIT 2025-12-04T09:26:03.9066437Z ++ eval 'extract_trap_cmd trap -- '\'' 2025-12-04T09:26:03.9066798Z sccache_epilogue'\'' EXIT' 2025-12-04T09:26:03.9067090Z +++ extract_trap_cmd trap -- ' 2025-12-04T09:26:03.9067373Z sccache_epilogue' EXIT 2025-12-04T09:26:03.9067617Z +++ printf '%s\n' ' 2025-12-04T09:26:03.9067911Z sccache_epilogue' 2025-12-04T09:26:03.9068274Z ++ printf '%s\n' cleanup_workspace 2025-12-04T09:26:03.9069437Z + trap -- ' 2025-12-04T09:26:03.9069652Z sccache_epilogue 2025-12-04T09:26:03.9069897Z cleanup_workspace' EXIT 2025-12-04T09:26:03.9070213Z + sudo chown -R jenkins /var/lib/jenkins/workspace 2025-12-04T09:26:04.9553204Z + git config --global --add safe.directory /var/lib/jenkins/workspace 2025-12-04T09:26:04.9572586Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *cuda* ]] 2025-12-04T09:26:04.9576049Z ++ python -c 'import os;import numba.cuda; print(os.path.dirname(numba.cuda.__file__))' 2025-12-04T09:26:05.3863762Z + NUMBA_CUDA_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda 2025-12-04T09:26:05.3864395Z + '[' -n /opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda ']' 2025-12-04T09:26:05.3869956Z +++ realpath .ci/pytorch/test.sh 2025-12-04T09:26:05.3884216Z ++ dirname /var/lib/jenkins/workspace/.ci/pytorch/test.sh 2025-12-04T09:26:05.4009186Z + NUMBA_PATCH=/var/lib/jenkins/workspace/.ci/pytorch/numba-cuda-13.patch 2025-12-04T09:26:05.4010113Z + pushd /opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda 2025-12-04T09:26:05.4010691Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda ~/workspace 2025-12-04T09:26:05.4011147Z + patch -p4 2025-12-04T09:26:05.4025968Z patching file cudadrv/driver.py 2025-12-04T09:26:05.4026351Z Hunk #1 succeeded at 357 (offset -8 lines). 2025-12-04T09:26:05.4081688Z + popd 2025-12-04T09:26:05.4081936Z ~/workspace 2025-12-04T09:26:05.4082177Z + echo 'Environment variables:' 2025-12-04T09:26:05.4082473Z Environment variables: 2025-12-04T09:26:05.4082707Z + env 2025-12-04T09:26:05.4092187Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-12-04T09:26:05.4092961Z CONTINUE_THROUGH_ERROR=True 2025-12-04T09:26:05.4093691Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck 2025-12-04T09:26:05.4094429Z VLLM_TEST_HUGGING_FACE_TOKEN=*** 2025-12-04T09:26:05.4094728Z HOSTNAME=9eab885271a8 2025-12-04T09:26:05.4095310Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4095920Z GITHUB_ACTION=__run_3 2025-12-04T09:26:05.4096200Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T09:26:05.4096505Z GITHUB_RUN_NUMBER=19107 2025-12-04T09:26:05.4096763Z TEST_CONFIG=default 2025-12-04T09:26:05.4097022Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T09:26:05.4097350Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-12-04T09:26:05.4097895Z SCCACHE_IDLE_TIMEOUT=0 2025-12-04T09:26:05.4098328Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-12-04T09:26:05.4098627Z GITHUB_TRIGGERING_ACTOR=huydhn 2025-12-04T09:26:05.4098907Z GITHUB_REF_TYPE=branch 2025-12-04T09:26:05.4099202Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:26:05.4099543Z XLA_CUDA= 2025-12-04T09:26:05.4099775Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 2025-12-04T09:26:05.4100208Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T09:26:05.4100802Z *** 2025-12-04T09:26:05.4101033Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T09:26:05.4101387Z GITHUB_ACTIONS=true 2025-12-04T09:26:05.4101649Z NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:26:05.4101998Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T09:26:05.4102393Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:26:05.4102778Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:26:05.4103396Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/periodic.yml@refs/heads/main 2025-12-04T09:26:05.4103882Z UCC_HOME=/usr 2025-12-04T09:26:05.4104121Z VERBOSE_TEST_LOGS=False 2025-12-04T09:26:05.4104377Z GITHUB_REF=refs/heads/main 2025-12-04T09:26:05.4104637Z SHARD_NUMBER=4 2025-12-04T09:26:05.4104875Z GITHUB_REF_PROTECTED=true 2025-12-04T09:26:05.4105135Z HOME=/var/lib/jenkins 2025-12-04T09:26:05.4105425Z GITHUB_API_URL=https://api.github.com 2025-12-04T09:26:05.4105760Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T09:26:05.4106108Z UCX_COMMIT=7836b165abdbe468a2f607e7254011c07d788152 2025-12-04T09:26:05.4106445Z USE_SYSTEM_NCCL=1 2025-12-04T09:26:05.4106680Z NUM_TEST_SHARDS=8 2025-12-04T09:26:05.4106909Z UCX_HOME=/usr 2025-12-04T09:26:05.4107463Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4108495Z JOB_NAME=linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:26:05.4109492Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4110276Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2025-12-04T09:26:05.4110772Z GITHUB_EVENT_NAME=schedule 2025-12-04T09:26:05.4111035Z DASHBOARD_TAG= 2025-12-04T09:26:05.4111259Z GITHUB_RUN_ID=19922826259 2025-12-04T09:26:05.4111518Z INSTALLED_OPENBLAS= 2025-12-04T09:26:05.4112123Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4112772Z GITHUB_ACTOR=huydhn 2025-12-04T09:26:05.4113174Z PR_NUMBER= 2025-12-04T09:26:05.4113395Z DESIRED_CUDA=12.8.1 2025-12-04T09:26:05.4113631Z GITHUB_RUN_ATTEMPT=1 2025-12-04T09:26:05.4113936Z ANACONDA_PYTHON_VERSION=3.10 2025-12-04T09:26:05.4114277Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T09:26:05.4114622Z TERM=vt100 2025-12-04T09:26:05.4114833Z INSTALLED_VISION=yes 2025-12-04T09:26:05.4115080Z BRANCH=main 2025-12-04T09:26:05.4115300Z SCCACHE_REGION=us-east-1 2025-12-04T09:26:05.4115566Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T09:26:05.4115845Z BUILD_AOT_INDUCTOR_TEST= 2025-12-04T09:26:05.4116107Z CUDA_PATH=/usr/local/cuda 2025-12-04T09:26:05.4116610Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-12-04T09:26:05.4117177Z GITHUB_SERVER_URL=https://github.com 2025-12-04T09:26:05.4117529Z UCC_COMMIT=430e241bf5d38cbc73fc7a6b89155397232e3f96 2025-12-04T09:26:05.4117863Z REENABLED_ISSUES= 2025-12-04T09:26:05.4118085Z DOCS= 2025-12-04T09:26:05.4118280Z SHLVL=1 2025-12-04T09:26:05.4118489Z MAX_JOBS=14 2025-12-04T09:26:05.4118701Z GITHUB_ACTOR_ID=475357 2025-12-04T09:26:05.4119046Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:26:05.4119428Z GITHUB_REF_NAME=main 2025-12-04T09:26:05.4119800Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-12-04T09:26:05.4120224Z GITHUB_JOB=test 2025-12-04T09:26:05.4120563Z NO_TEST_TIMEOUT=False 2025-12-04T09:26:05.4120807Z TD_DISTRIBUTED=False 2025-12-04T09:26:05.4121077Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T09:26:05.4121377Z GITHUB_RETENTION_DAYS=90 2025-12-04T09:26:05.4121644Z OPENSSL_DIR=/opt/openssl 2025-12-04T09:26:05.4121915Z GITHUB_ACTION_REPOSITORY= 2025-12-04T09:26:05.4122664Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T09:26:05.4123433Z GITHUB_BASE_REF= 2025-12-04T09:26:05.4123657Z INSTALLED_ACL= 2025-12-04T09:26:05.4124064Z ARTIFACTS_FILE_SUFFIX=test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206 2025-12-04T09:26:05.4124882Z CI=true 2025-12-04T09:26:05.4125103Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T09:26:05.4125430Z RUST_LOG=sccache::server=error 2025-12-04T09:26:05.4125701Z JOB_ID=57118183206 2025-12-04T09:26:05.4125920Z GITHUB_HEAD_REF= 2025-12-04T09:26:05.4126147Z GITHUB_ACTION_REF= 2025-12-04T09:26:05.4126448Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-12-04T09:26:05.4126786Z TEST_SHOWLOCALS=False 2025-12-04T09:26:05.4127041Z GITHUB_WORKFLOW=periodic 2025-12-04T09:26:05.4127310Z DEBIAN_FRONTEND=noninteractive 2025-12-04T09:26:05.4127907Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4128515Z NO_TD=False 2025-12-04T09:26:05.4128746Z SKIP_SCCACHE_INITIALIZATION=1 2025-12-04T09:26:05.4129056Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-12-04T09:26:05.4129348Z _=/usr/bin/env 2025-12-04T09:26:05.4129703Z OLDPWD=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda 2025-12-04T09:26:05.4130217Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-12-04T09:26:05.4240910Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-12-04T09:26:05.4241626Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-12-04T09:26:05.4242233Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-12-04T09:26:05.4242876Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-12-04T09:26:05.4243404Z + BUILD_DIR=build 2025-12-04T09:26:05.4243677Z + BUILD_RENAMED_DIR=build_renamed 2025-12-04T09:26:05.4243974Z + BUILD_BIN_DIR=build/bin 2025-12-04T09:26:05.4244223Z + SHARD_NUMBER=4 2025-12-04T09:26:05.4244456Z + NUM_TEST_SHARDS=8 2025-12-04T09:26:05.4244720Z + export TORCH_SERIALIZATION_DEBUG=1 2025-12-04T09:26:05.4245031Z + TORCH_SERIALIZATION_DEBUG=1 2025-12-04T09:26:05.4245387Z + export VALGRIND=ON 2025-12-04T09:26:05.4245994Z + VALGRIND=ON 2025-12-04T09:26:05.4246415Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *clang9* ]] 2025-12-04T09:26:05.4246914Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *xpu* ]] 2025-12-04T09:26:05.4247324Z + detect_cuda_arch 2025-12-04T09:26:05.4247660Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *cuda* ]] 2025-12-04T09:26:05.4248061Z + command -v nvidia-smi 2025-12-04T09:26:05.4248321Z /usr/bin/nvidia-smi 2025-12-04T09:26:05.4253225Z ++ nvidia-smi --query-gpu=compute_cap --format=csv 2025-12-04T09:26:05.4253725Z ++ tail -n 1 2025-12-04T09:26:05.4524107Z + TORCH_CUDA_ARCH_LIST=8.6 2025-12-04T09:26:05.4524835Z + export TORCH_CUDA_ARCH_LIST 2025-12-04T09:26:05.4525239Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *s390x* ]] 2025-12-04T09:26:05.4525632Z + [[ 0 == \1 ]] 2025-12-04T09:26:05.4525847Z + [[ True == \1 ]] 2025-12-04T09:26:05.4526194Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck != *bazel* ]] 2025-12-04T09:26:05.4526640Z ++ realpath build/custom_test_artifacts 2025-12-04T09:26:05.4672537Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2025-12-04T09:26:05.4673026Z + [[ -n '' ]] 2025-12-04T09:26:05.4673268Z + echo 'Environment variables' 2025-12-04T09:26:05.4673552Z Environment variables 2025-12-04T09:26:05.4673786Z + env 2025-12-04T09:26:05.4826568Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-12-04T09:26:05.4827517Z CONTINUE_THROUGH_ERROR=True 2025-12-04T09:26:05.4828032Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck 2025-12-04T09:26:05.4828671Z VLLM_TEST_HUGGING_FACE_TOKEN=*** 2025-12-04T09:26:05.4828955Z HOSTNAME=9eab885271a8 2025-12-04T09:26:05.4829527Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4830200Z GITHUB_ACTION=__run_3 2025-12-04T09:26:05.4830469Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T09:26:05.4830774Z GITHUB_RUN_NUMBER=19107 2025-12-04T09:26:05.4831040Z TEST_CONFIG=default 2025-12-04T09:26:05.4831290Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T09:26:05.4831618Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-12-04T09:26:05.4831935Z SCCACHE_IDLE_TIMEOUT=0 2025-12-04T09:26:05.4832318Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-12-04T09:26:05.4832610Z GITHUB_TRIGGERING_ACTOR=huydhn 2025-12-04T09:26:05.4832895Z GITHUB_REF_TYPE=branch 2025-12-04T09:26:05.4833156Z TORCH_CUDA_ARCH_LIST=8.6 2025-12-04T09:26:05.4833459Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:26:05.4833795Z XLA_CUDA= 2025-12-04T09:26:05.4834032Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 2025-12-04T09:26:05.4834624Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T09:26:05.4834941Z *** 2025-12-04T09:26:05.4835162Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T09:26:05.4835443Z GITHUB_ACTIONS=true 2025-12-04T09:26:05.4835742Z NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T09:26:05.4836165Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T09:26:05.4836649Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:26:05.4837122Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:26:05.4837847Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/periodic.yml@refs/heads/main 2025-12-04T09:26:05.4838499Z UCC_HOME=/usr 2025-12-04T09:26:05.4838827Z TORCH_SERIALIZATION_DEBUG=1 2025-12-04T09:26:05.4839200Z VERBOSE_TEST_LOGS=False 2025-12-04T09:26:05.4839561Z GITHUB_REF=refs/heads/main 2025-12-04T09:26:05.4839913Z SHARD_NUMBER=4 2025-12-04T09:26:05.4840195Z GITHUB_REF_PROTECTED=true 2025-12-04T09:26:05.4840569Z HOME=/var/lib/jenkins 2025-12-04T09:26:05.4840952Z GITHUB_API_URL=https://api.github.com 2025-12-04T09:26:05.4841391Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T09:26:05.4841808Z UCX_COMMIT=7836b165abdbe468a2f607e7254011c07d788152 2025-12-04T09:26:05.4842185Z USE_SYSTEM_NCCL=1 2025-12-04T09:26:05.4842413Z NUM_TEST_SHARDS=8 2025-12-04T09:26:05.4842633Z UCX_HOME=/usr 2025-12-04T09:26:05.4843410Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4844438Z JOB_NAME=linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T09:26:05.4845426Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4846212Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2025-12-04T09:26:05.4846703Z GITHUB_EVENT_NAME=schedule 2025-12-04T09:26:05.4846957Z DASHBOARD_TAG= 2025-12-04T09:26:05.4847189Z GITHUB_RUN_ID=19922826259 2025-12-04T09:26:05.4847445Z INSTALLED_OPENBLAS= 2025-12-04T09:26:05.4848034Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4848681Z GITHUB_ACTOR=huydhn 2025-12-04T09:26:05.4848916Z PR_NUMBER= 2025-12-04T09:26:05.4849123Z DESIRED_CUDA=12.8.1 2025-12-04T09:26:05.4849368Z GITHUB_RUN_ATTEMPT=1 2025-12-04T09:26:05.4849607Z VALGRIND=ON 2025-12-04T09:26:05.4849829Z ANACONDA_PYTHON_VERSION=3.10 2025-12-04T09:26:05.4850166Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T09:26:05.4850510Z TERM=vt100 2025-12-04T09:26:05.4850720Z INSTALLED_VISION=yes 2025-12-04T09:26:05.4850960Z BRANCH=main 2025-12-04T09:26:05.4851182Z SCCACHE_REGION=us-east-1 2025-12-04T09:26:05.4852608Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T09:26:05.4852888Z BUILD_AOT_INDUCTOR_TEST= 2025-12-04T09:26:05.4853150Z CUDA_PATH=/usr/local/cuda 2025-12-04T09:26:05.4853658Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-12-04T09:26:05.4854212Z GITHUB_SERVER_URL=https://github.com 2025-12-04T09:26:05.4854563Z UCC_COMMIT=430e241bf5d38cbc73fc7a6b89155397232e3f96 2025-12-04T09:26:05.4854900Z REENABLED_ISSUES= 2025-12-04T09:26:05.4855111Z DOCS= 2025-12-04T09:26:05.4855308Z SHLVL=1 2025-12-04T09:26:05.4855508Z MAX_JOBS=14 2025-12-04T09:26:05.4855722Z GITHUB_ACTOR_ID=475357 2025-12-04T09:26:05.4856057Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:26:05.4856436Z GITHUB_REF_NAME=main 2025-12-04T09:26:05.4856803Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-12-04T09:26:05.4857221Z GITHUB_JOB=test 2025-12-04T09:26:05.4857449Z NO_TEST_TIMEOUT=False 2025-12-04T09:26:05.4857692Z TD_DISTRIBUTED=False 2025-12-04T09:26:05.4857966Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T09:26:05.4858265Z GITHUB_RETENTION_DAYS=90 2025-12-04T09:26:05.4858522Z OPENSSL_DIR=/opt/openssl 2025-12-04T09:26:05.4858789Z GITHUB_ACTION_REPOSITORY= 2025-12-04T09:26:05.4859531Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T09:26:05.4860288Z GITHUB_BASE_REF= 2025-12-04T09:26:05.4860514Z INSTALLED_ACL= 2025-12-04T09:26:05.4860919Z ARTIFACTS_FILE_SUFFIX=test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206 2025-12-04T09:26:05.4861370Z CI=true 2025-12-04T09:26:05.4861614Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T09:26:05.4861938Z RUST_LOG=sccache::server=error 2025-12-04T09:26:05.4862236Z JOB_ID=57118183206 2025-12-04T09:26:05.4862487Z GITHUB_HEAD_REF= 2025-12-04T09:26:05.4862712Z GITHUB_ACTION_REF= 2025-12-04T09:26:05.4863139Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-12-04T09:26:05.4863489Z TEST_SHOWLOCALS=False 2025-12-04T09:26:05.4863737Z GITHUB_WORKFLOW=periodic 2025-12-04T09:26:05.4864009Z DEBIAN_FRONTEND=noninteractive 2025-12-04T09:26:05.4864613Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_72c5acef-13d9-4000-8e69-b6aad7e1627c 2025-12-04T09:26:05.4865218Z NO_TD=False 2025-12-04T09:26:05.4865453Z SKIP_SCCACHE_INITIALIZATION=1 2025-12-04T09:26:05.4865760Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-12-04T09:26:05.4866193Z OLDPWD=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/numba/cuda 2025-12-04T09:26:05.4866601Z _=/usr/bin/env 2025-12-04T09:26:05.4866929Z + echo 'Testing pytorch' 2025-12-04T09:26:05.4867185Z Testing pytorch 2025-12-04T09:26:05.4867421Z + export LANG=C.UTF-8 2025-12-04T09:26:05.4867662Z + LANG=C.UTF-8 2025-12-04T09:26:05.4867875Z + PR_NUMBER= 2025-12-04T09:26:05.4868094Z + [[ default == \d\e\f\a\u\l\t ]] 2025-12-04T09:26:05.4868383Z + export CUDA_VISIBLE_DEVICES=0 2025-12-04T09:26:05.4868661Z + CUDA_VISIBLE_DEVICES=0 2025-12-04T09:26:05.4868928Z + export HIP_VISIBLE_DEVICES=0 2025-12-04T09:26:05.4869208Z + HIP_VISIBLE_DEVICES=0 2025-12-04T09:26:05.4869472Z + [[ default == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-12-04T09:26:05.4869765Z + [[ default == \s\l\o\w ]] 2025-12-04T09:26:05.4870170Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *slow-gradcheck* ]] 2025-12-04T09:26:05.4870638Z + export PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 2025-12-04T09:26:05.4870973Z + PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 2025-12-04T09:26:05.4871299Z + export PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T09:26:05.4871638Z + PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T09:26:05.4872083Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *cuda* ]] 2025-12-04T09:26:05.4872513Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T09:26:05.4872856Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T09:26:05.4873160Z + [[ default == *crossref* ]] 2025-12-04T09:26:05.4873524Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *rocm* ]] 2025-12-04T09:26:05.4874103Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *xpu* ]] 2025-12-04T09:26:05.4874607Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck != *-bazel-* ]] 2025-12-04T09:26:05.4875014Z + pip_install ninja==1.10.2 2025-12-04T09:26:05.4875381Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-12-04T09:26:05.4875840Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-12-04T09:26:06.1513949Z Collecting ninja==1.10.2 2025-12-04T09:26:06.1764522Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-12-04T09:26:06.2068005Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-12-04T09:26:06.6125915Z Installing collected packages: ninja 2025-12-04T09:26:06.6126282Z Attempting uninstall: ninja 2025-12-04T09:26:06.6135284Z Found existing installation: ninja 1.11.1.4 2025-12-04T09:26:06.6159263Z Uninstalling ninja-1.11.1.4: 2025-12-04T09:26:06.6257189Z Successfully uninstalled ninja-1.11.1.4 2025-12-04T09:26:06.6955976Z Successfully installed ninja-1.10.2 2025-12-04T09:26:06.7543525Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T09:26:06.7545019Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T09:26:06.7546025Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *aarch64* ]] 2025-12-04T09:26:06.7546549Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *asan* ]] 2025-12-04T09:26:06.7547121Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *-debug* ]] 2025-12-04T09:26:06.7547671Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck != *-bazel-* ]] 2025-12-04T09:26:06.7548437Z + echo 'We are not in debug mode: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck. Expect the assertion to pass' 2025-12-04T09:26:06.7549254Z We are not in debug mode: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck. Expect the assertion to pass 2025-12-04T09:26:06.7549775Z + cd test 2025-12-04T09:26:06.7550120Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-12-04T09:26:08.4194204Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-12-04T09:26:08.4194626Z + [[ default == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-12-04T09:26:08.4194984Z + [[ default == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-12-04T09:26:08.4198843Z + DYNAMO_BENCHMARK_FLAGS=() 2025-12-04T09:26:08.4199672Z + [[ default == *pr_time_benchmarks* ]] 2025-12-04T09:26:08.4200054Z + [[ default == *dynamo_eager* ]] 2025-12-04T09:26:08.4200343Z + [[ default == *aot_eager* ]] 2025-12-04T09:26:08.4200624Z + [[ default == *aot_inductor* ]] 2025-12-04T09:26:08.4200939Z + [[ default == *max_autotune_inductor* ]] 2025-12-04T09:26:08.4201256Z + [[ default == *inductor* ]] 2025-12-04T09:26:08.4201549Z + [[ default == *dynamic* ]] 2025-12-04T09:26:08.4201824Z + [[ default == *cpu* ]] 2025-12-04T09:26:08.4202075Z + [[ default == *xpu* ]] 2025-12-04T09:26:08.4202360Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-12-04T09:26:08.4231277Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *libtorch* ]] 2025-12-04T09:26:08.4231825Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *-bazel-* ]] 2025-12-04T09:26:08.4235303Z + cd test 2025-12-04T09:26:08.4235660Z + python -c 'import torch; print(torch.__config__.show())' 2025-12-04T09:26:10.0912596Z PyTorch built with: 2025-12-04T09:26:10.0913008Z - GCC 11.4 2025-12-04T09:26:10.0913314Z - C++ Version: 201703 2025-12-04T09:26:10.0913948Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T09:26:10.0914649Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T09:26:10.0915090Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-12-04T09:26:10.0915707Z - LAPACK is enabled (usually provided by MKL) 2025-12-04T09:26:10.0916039Z - NNPACK is enabled 2025-12-04T09:26:10.0916306Z - CPU capability usage: AVX2 2025-12-04T09:26:10.0916579Z - CUDA Runtime 12.8 2025-12-04T09:26:10.0916936Z - NVCC architecture flags: -gencode;arch=compute_86,code=sm_86 2025-12-04T09:26:10.0917348Z - CuDNN 91.0.2 (built against CUDA 12.9) 2025-12-04T09:26:10.0921861Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32, CUDA_VERSION=12.8, CUDNN_VERSION=9.10.2, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Werror -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=ON, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-12-04T09:26:10.0926674Z 2025-12-04T09:26:10.4681007Z + cd test 2025-12-04T09:26:10.4681391Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-12-04T09:26:11.7965646Z ATen/Parallel: 2025-12-04T09:26:11.7966041Z at::get_num_threads() : 8 2025-12-04T09:26:11.7966412Z at::get_num_interop_threads() : 16 2025-12-04T09:26:11.7966801Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-12-04T09:26:11.7967125Z omp_get_max_threads() : 8 2025-12-04T09:26:11.7967667Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T09:26:11.7968245Z mkl_get_max_threads() : 8 2025-12-04T09:26:11.7968631Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T09:26:11.7969075Z std::thread::hardware_concurrency() : 16 2025-12-04T09:26:11.7969389Z Environment variables: 2025-12-04T09:26:11.7969653Z OMP_NUM_THREADS : [not set] 2025-12-04T09:26:11.7969929Z MKL_NUM_THREADS : [not set] 2025-12-04T09:26:11.7970646Z ATen parallel backend: OpenMP 2025-12-04T09:26:11.7970848Z 2025-12-04T09:26:12.1206034Z + [[ default == *numpy_2* ]] 2025-12-04T09:26:12.1206575Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *aarch64* ]] 2025-12-04T09:26:12.1207132Z + [[ default == *backward* ]] 2025-12-04T09:26:12.1207557Z + [[ default == *libtorch_agnostic_targetting* ]] 2025-12-04T09:26:12.1208003Z + [[ default == *xla* ]] 2025-12-04T09:26:12.1208372Z + [[ default == *vllm* ]] 2025-12-04T09:26:12.1208730Z + [[ default == *executorch* ]] 2025-12-04T09:26:12.1209134Z + [[ default == \j\i\t\_\l\e\g\a\c\y ]] 2025-12-04T09:26:12.1209516Z + [[ default == \q\u\a\n\t\i\z\a\t\i\o\n ]] 2025-12-04T09:26:12.1209977Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *libtorch* ]] 2025-12-04T09:26:12.1210572Z + [[ default == distributed ]] 2025-12-04T09:26:12.1210974Z + [[ default == *operator_benchmark* ]] 2025-12-04T09:26:12.1211385Z + [[ default == *operator_microbenchmark* ]] 2025-12-04T09:26:12.1211731Z + [[ default == *attention_microbenchmark* ]] 2025-12-04T09:26:12.1212161Z + [[ default == *inductor_distributed* ]] 2025-12-04T09:26:12.1212496Z + [[ default == *inductor-halide* ]] 2025-12-04T09:26:12.1212902Z + [[ default == *inductor-pallas* ]] 2025-12-04T09:26:12.1213342Z + [[ default == *inductor-triton-cpu* ]] 2025-12-04T09:26:12.1213786Z + [[ default == *inductor-micro-benchmark* ]] 2025-12-04T09:26:12.1214158Z + [[ default == *aoti_cross_compile_for_windows* ]] 2025-12-04T09:26:12.1214764Z + [[ default == *huggingface* ]] 2025-12-04T09:26:12.1215055Z + [[ default == *timm* ]] 2025-12-04T09:26:12.1215313Z + [[ default == cachebench ]] 2025-12-04T09:26:12.1215596Z + [[ default == verify_cachebench ]] 2025-12-04T09:26:12.1215901Z + [[ default == *torchbench* ]] 2025-12-04T09:26:12.1216196Z + [[ default == *inductor_cpp_wrapper* ]] 2025-12-04T09:26:12.1216514Z + [[ default == *inductor_core* ]] 2025-12-04T09:26:12.1216805Z + [[ default == *inductor* ]] 2025-12-04T09:26:12.1217080Z + [[ default == *einops* ]] 2025-12-04T09:26:12.1217355Z + [[ default == *dynamo_core* ]] 2025-12-04T09:26:12.1217644Z + [[ default == *dynamo_wrapped* ]] 2025-12-04T09:26:12.1218036Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *rocm* ]] 2025-12-04T09:26:12.1218416Z + [[ 4 == 1 ]] 2025-12-04T09:26:12.1218625Z + [[ 4 == 2 ]] 2025-12-04T09:26:12.1218842Z + [[ 4 -gt 2 ]] 2025-12-04T09:26:12.1219059Z + install_torchvision 2025-12-04T09:26:12.1219309Z + local orig_preload 2025-12-04T09:26:12.1219548Z + local commit 2025-12-04T09:26:12.1219768Z ++ get_pinned_commit vision 2025-12-04T09:26:12.1220053Z ++ cat .github/ci_commit_pins/vision.txt 2025-12-04T09:26:12.1232373Z + commit=617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:26:12.1232831Z + orig_preload= 2025-12-04T09:26:12.1233081Z + '[' -n '' ']' 2025-12-04T09:26:12.1233502Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck == *cuda* ]] 2025-12-04T09:26:12.1233906Z + export FORCE_CUDA=1 2025-12-04T09:26:12.1234219Z + FORCE_CUDA=1 2025-12-04T09:26:12.1234447Z + export WITH_CUDA=1 2025-12-04T09:26:12.1234694Z + WITH_CUDA=1 2025-12-04T09:26:12.1235318Z + pip_build_and_install git+https://github.com/pytorch/vision.git@617079d944b0e72632311c30ae2bbdf1168b901e dist/vision 2025-12-04T09:26:12.1236246Z + local build_target=git+https://github.com/pytorch/vision.git@617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:26:12.1236873Z + local wheel_dir=dist/vision 2025-12-04T09:26:12.1237153Z + local found_whl=0 2025-12-04T09:26:12.1237435Z + for file in "${wheel_dir}"/*.whl 2025-12-04T09:26:12.1237792Z + [[ -f dist/vision/*.whl ]] 2025-12-04T09:26:12.1238057Z + '[' 0 == 0 ']' 2025-12-04T09:26:12.1238774Z + python3 -m pip wheel --no-build-isolation --no-deps -w dist/vision git+https://github.com/pytorch/vision.git@617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:26:12.4478696Z Collecting git+https://github.com/pytorch/vision.git@617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:26:12.4482964Z Cloning https://github.com/pytorch/vision.git (to revision 617079d944b0e72632311c30ae2bbdf1168b901e) to /tmp/pip-req-build-zzh5gyos 2025-12-04T09:26:12.4632472Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/vision.git /tmp/pip-req-build-zzh5gyos 2025-12-04T09:26:14.1948467Z Running command git rev-parse -q --verify 'sha^617079d944b0e72632311c30ae2bbdf1168b901e' 2025-12-04T09:26:14.1976656Z Running command git fetch -q https://github.com/pytorch/vision.git 617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:26:14.3321142Z Resolved https://github.com/pytorch/vision.git to commit 617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:26:16.4593241Z Preparing metadata (pyproject.toml) ... [?25l- \ | done 2025-12-04T09:26:16.4626632Z [?25hBuilding wheels for collected packages: torchvision 2025-12-04T09:27:34.1714572Z Building wheel for torchvision (pyproject.toml) ... [?25l- \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - done 2025-12-04T09:27:34.1746000Z [?25h Created wheel for torchvision: filename=torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl size=1786263 sha256=91c5d39bd660ad3f94512618fed38901f25f4d363e085941ae57f8c5eeb10974 2025-12-04T09:27:34.1749082Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/12/b2/29/1f82685c5b5173629e1f36a9b93989ce92ce563e5fb91d27ac 2025-12-04T09:27:34.1783540Z Successfully built torchvision 2025-12-04T09:27:34.2897194Z + for file in "${wheel_dir}"/*.whl 2025-12-04T09:27:34.2897756Z + pip_install_whl dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl 2025-12-04T09:27:34.2898418Z + args=('dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl') 2025-12-04T09:27:34.2898863Z + local args 2025-12-04T09:27:34.2899262Z + [[ dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-12-04T09:27:34.2899735Z + for path in "${args[@]}" 2025-12-04T09:27:34.2900198Z + echo 'Installing dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl' 2025-12-04T09:27:34.2900887Z Installing dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl 2025-12-04T09:27:34.2901653Z + python3 -mpip install --no-index --no-deps dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl 2025-12-04T09:27:34.6227365Z Processing ./dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl 2025-12-04T09:27:34.6326073Z Installing collected packages: torchvision 2025-12-04T09:27:35.1179049Z Successfully installed torchvision-0.25.0a0+617079d 2025-12-04T09:27:35.1565988Z + '[' -n '' ']' 2025-12-04T09:27:35.1566303Z + test_python_shard 4 2025-12-04T09:27:35.1566561Z + [[ -z 8 ]] 2025-12-04T09:27:35.1567298Z + python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --exclude-quantization-tests --shard 4 8 --verbose --upload-artifacts-while-running 2025-12-04T09:27:38.2564168Z Excluding doctests Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2565166Z Excluding test_meta Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2565858Z Excluding test_hub Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2566534Z Excluding test_fx Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2567209Z Excluding test_decomp Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2567997Z Excluding test_cpp_extensions_jit Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2568731Z Excluding test_jit Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2569435Z Excluding test_matmul_cuda Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2570130Z Excluding test_ops Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2571135Z Excluding test_ops_jit Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2571894Z Excluding dynamo/test_recompile_ux Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2572746Z Excluding inductor/test_compiled_optimizers Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2573611Z Excluding inductor/test_cutlass_backend Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2574454Z Excluding inductor/test_max_autotune Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2575308Z Excluding inductor/test_select_algorithm Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:38.2576121Z Excluding inductor/test_smoke Running in slow gradcheck mode, skipping tests that don't use gradcheck. 2025-12-04T09:27:40.2331501Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2025-12-04T09:27:40.2847361Z Ignoring disabled issues: [''] 2025-12-04T09:27:40.2952971Z Found test times from artifacts 2025-12-04T09:27:40.3358464Z Found test times from artifacts 2025-12-04T09:27:40.3371656Z Running all tests 2025-12-04T09:27:40.3997740Z Running parallel tests on 1 processes 2025-12-04T09:27:40.4001783Z Name: tests to run (est. time: 143.34min) 2025-12-04T09:27:40.4002463Z Serial tests (59): 2025-12-04T09:27:40.4002753Z inductor/test_aot_inductor 4/5 2025-12-04T09:27:40.4003134Z inductor/test_torchinductor_codegen_dynamic_shapes 3/4 2025-12-04T09:27:40.4003541Z inductor/test_torchinductor_opinfo 6/14 2025-12-04T09:27:40.4003893Z inductor/test_torchinductor_opinfo 14/14 2025-12-04T09:27:40.4004247Z inductor/test_compile_subprocess 1/2 2025-12-04T09:27:40.4004579Z inductor/test_deterministic 1/3 2025-12-04T09:27:40.4004872Z dynamo/test_sets 1/1 2025-12-04T09:27:40.4005162Z dynamo/test_compiler_bisector 1/1 2025-12-04T09:27:40.4005485Z dynamo/test_decorators 1/1 2025-12-04T09:27:40.4005761Z test_transformers 1/1 2025-12-04T09:27:40.4016159Z test_ops_fwd_gradients 3/12 2025-12-04T09:27:40.4016524Z test_ops_fwd_gradients 11/12 2025-12-04T09:27:40.4016814Z test_ops_gradients 7/10 2025-12-04T09:27:40.4017183Z test_linalg 1/1 2025-12-04T09:27:40.4017516Z functorch/test_ops 2/6 2025-12-04T09:27:40.4017857Z inductor/test_cpu_repro 2/2 2025-12-04T09:27:40.4018154Z dynamo/test_ctx_manager 1/1 2025-12-04T09:27:40.4018461Z inductor/test_cpu_select_algorithm 1/1 2025-12-04T09:27:40.4018818Z inductor/test_torchinductor_strided_blocks 1/1 2025-12-04T09:27:40.4019181Z inductor/test_multi_kernel 1/1 2025-12-04T09:27:40.4019482Z inductor/test_pad_mm 1/1 2025-12-04T09:27:40.4019767Z test_sparse_semi_structured 1/1 2025-12-04T09:27:40.4020082Z inductor/test_gpu_cpp_wrapper 1/1 2025-12-04T09:27:40.4020423Z inductor/test_move_constructors_to_gpu 1/1 2025-12-04T09:27:40.4020750Z dynamo/test_list 1/1 2025-12-04T09:27:40.4021014Z dynamo/test_resume 1/1 2025-12-04T09:27:40.4021308Z inductor/test_augmented_graph_helper 1/1 2025-12-04T09:27:40.4021648Z dynamo/test_deviceguard 1/1 2025-12-04T09:27:40.4021926Z dynamo/test_sources 1/1 2025-12-04T09:27:40.4022226Z dynamo/test_backward_higher_order_ops 1/1 2025-12-04T09:27:40.4022558Z dynamo/test_optimizers 1/1 2025-12-04T09:27:40.4022964Z inductor/test_custom_partitioner_fn 1/1 2025-12-04T09:27:40.4023293Z dynamo/test_debug_utils 1/1 2025-12-04T09:27:40.4023573Z dynamo/test_base_hop 1/1 2025-12-04T09:27:40.4023839Z dynamo/test_export 1/1 2025-12-04T09:27:40.4024130Z dynamo/test_python_dispatcher 1/1 2025-12-04T09:27:40.4024693Z export/test_swap 1/1 2025-12-04T09:27:40.4024958Z export/test_unflatten 1/1 2025-12-04T09:27:40.4025254Z dynamo/test_verify_correctness 1/1 2025-12-04T09:27:40.4025571Z inductor/test_fxir_backend 1/1 2025-12-04T09:27:40.4025864Z dynamo/test_flat_apply 1/1 2025-12-04T09:27:40.4026367Z dynamo/test_input_attr_tracking 1/1 2025-12-04T09:27:40.4026695Z dynamo/test_graph_deduplication 1/1 2025-12-04T09:27:40.4027034Z inductor/test_distributed_patterns 1/1 2025-12-04T09:27:40.4027356Z dynamo/test_structured_trace 1/1 2025-12-04T09:27:40.4027667Z dynamo/test_fake_distributed 1/1 2025-12-04T09:27:40.4027968Z export/test_nativert 1/1 2025-12-04T09:27:40.4028239Z export/test_hop 1/1 2025-12-04T09:27:40.4028505Z test_model_exports_to_core_aten 1/1 2025-12-04T09:27:40.4028828Z dynamo/test_precompile_context 1/1 2025-12-04T09:27:40.4029130Z dynamo/test_trace_rules 1/1 2025-12-04T09:27:40.4029413Z export/test_upgrader 1/1 2025-12-04T09:27:40.4029689Z dynamo/test_hooks 1/1 2025-12-04T09:27:40.4029946Z export/test_sparse 1/1 2025-12-04T09:27:40.4030204Z test_modules 1/4 2025-12-04T09:27:40.4030478Z complex_tensor/test_complex_tensor 1/1 2025-12-04T09:27:40.4030814Z benchmark_utils/test_benchmark_utils 1/1 2025-12-04T09:27:40.4031187Z torch_np/numpy_tests/core/test_scalarmath 1/1 2025-12-04T09:27:40.4031531Z nn/test_convolution 2/2 2025-12-04T09:27:40.4031803Z Parallel tests (0): 2025-12-04T09:27:40.4032062Z Name: excluded (est. time: 0.0min) 2025-12-04T09:27:40.4032352Z Serial tests (0): 2025-12-04T09:27:40.4032594Z Parallel tests (0): 2025-12-04T09:27:40.4033002Z Running inductor/test_aot_inductor 4/5 ... [2025-12-04 09:27:40.400662][939.298231394] 2025-12-04T09:27:40.4033607Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T09:27:40.4034616Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '--shard-id=4', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:27:40.401040] 2025-12-04T09:35:20.0400242Z 2025-12-04T09:35:20.0401374Z inductor/test_aot_inductor 4/5 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_4.5_12641f387b8fb16f_.log 2025-12-04T09:35:20.0492405Z Running 192 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_sym_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printing_model_inputs_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_runtime_asserts_backed_symint_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_runtime_asserts_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_user_defined_triton_kernel_profiling_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_assert_async_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_autotune_with_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_and_force_mmap_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_unbacked_symint_closure_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_unbacked_symint_closure_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_use_buffers_from_outer_scope_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_replace_view_ops_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_original_fqn_and_dtype_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_type_propagation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_copy_non_blocking_is_pinned_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_smem_above_default_limit_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_embedding_bag_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_cat_dtype_promotion_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_graph_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fallback_kernel_with_symexpr_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fx_gm_return_tuple_validation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_input_codegen_with_sympy_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_mixed_device_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nan_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_contiguous_output_alias_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_default_gpu_device_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_expr_transitive_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_small_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_stft_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sym_i64_input_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sympy_cpp_printer_min_max_minmax0_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_multi_output_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_expr_replacements_shift_k_0_use_static_size_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_expr_replacements_shift_k_2_use_static_size_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_user_managed_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_buffers_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_code_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_no_triton_profiler_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__int_mm_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aot_inductor_consts_cpp_build_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_profiler_enable_kernel_profile_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_user_defined_triton_kernel_profiling_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_backward_no_op_logging_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_and_force_mmap_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_composed_dynamic_size_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_share_predicate_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_outer_code_before_after_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_parameters_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_deconv_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fallback_mem_leak_fix_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fp8_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fqn_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_int_list_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_dynamic_dim_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_linear_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_load_package_multiple_gpus_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misc_1_max_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misc_1_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_missing_cubin_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_multi_device_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_multiple_output_alias_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_narrow_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_nested_tensor_from_jagged_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_contiguous_output_alias_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_default_gpu_device_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_misaligned_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_path_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_profile_benchmark_harness_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_abs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quantized_linear_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeat_interleave_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeat_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_return_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_fp8_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_large_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_same_backing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_seq_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_multi_arch_embed_kernel_binary_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_symint_item_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sympy_cpp_printer_min_max_minmax0_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sympy_cpp_printer_min_max_minmax1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_dynamic_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_float_arg_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_reinterpret_view_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_expr_replacements_shift_k_0_use_static_size_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_expr_replacements_shift_k_3_use_static_size_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_outer_buffers_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_pytree_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_sym_expr_cond_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_no_triton_profiler_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_addmm_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_fp8_dtype_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_sym_inputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_autotuning_args_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_bool_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_boolean_indexing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_3_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_composed_dynamic_size_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_use_buffers_from_outer_scope_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_multiple_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_reinterpret_view_inputs_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_folding_with_update_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_conv3d_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_conv_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_deconv_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_device_moved_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_scalar_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_smem_above_default_limit_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_cat_dtype_promotion_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fp8_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fp8_view_of_param_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_index_put_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_input_codegen_with_sympy_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_linear_dynamic_maxautotune_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misc_1_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_missing_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_model_modified_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_multi_device_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_narrow_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_nested_tensor_from_jagged_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_quanatized_int8_linear_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_return_view_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_reuse_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_fp8_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_large_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scaled_grouped_mm_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_True_max_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_split_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_expr_transitive_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_stride_with_unbacked_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sym_i64_input_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_symfloat_item_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_bool_param_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_dynamic_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_dynamic_shape_with_div_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_float_arg_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_extern_kernel_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_reinterpret_view_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_sympy_expr_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_inactive_constant_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_view_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_weight_on_disk_legacy_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_nested_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_mixed_device_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_mixed_device_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_parameters_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_no_triton_profiler_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_size_weight_mps 2025-12-04T09:35:20.0573386Z 2025-12-04T09:35:20.0573695Z Finished inductor/test_aot_inductor 4/5 ... [2025-12-04 09:35:20.039482][1398.937051069], took 7.66min 2025-12-04T09:35:20.0574729Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-b20db2ea014300dc.xml 2025-12-04T09:35:20.4888879Z Uploading artifacts took 0.14 seconds 2025-12-04T09:35:20.4891738Z Running inductor/test_torchinductor_codegen_dynamic_shapes 3/4 ... [2025-12-04 09:35:20.488819][1399.386386007] 2025-12-04T09:35:20.4892323Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T09:35:20.4895958Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_dynamic_shapes.py', '--shard-id=3', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:35:20.489227] 2025-12-04T09:42:54.4656883Z 2025-12-04T09:42:54.4658345Z inductor/test_torchinductor_codegen_dynamic_shapes 3/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_3.4_eaaf38f1800870c1_.log 2025-12-04T09:42:54.4900795Z Running 447 items in this shard: test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_pool_errors_with_long_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex9_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_addmv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aliased_buffer_reuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_allow_reuse_active_if_under_peak_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_min_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool_errors_with_uint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_baddbmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bernoulli1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bernoulli2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bfloat16_to_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bmm2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_both_scalars_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_computed_offsets_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_default_kwargs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_nd_tiling_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_buffer_copied_in_graph_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_int_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_empty_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_unbacked_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_chunk_recompiles_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_clamp_type_promotion_non_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_complex_memory_overlap_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_config_option_dont_assume_alignment_recompiles_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_consecutive_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_fill_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_nd_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv2d_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_convolution4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_copy_non_blocking_is_pinned_use_cat_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cos_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cpu_scalar_with_gpu_tensor_cpp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_no_mask_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_default_layout_constraint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_scan_would_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_deterministic_codegen_with_suffix_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_diagonal_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dist_bf16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout_trivial_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtype_mismatch_issue_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_empty_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exact_stride_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_basic_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_list_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_float32_to_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_like_transposed_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fuse_tiled_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gelu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_constant_tensor2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_mutation_real_name_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_no_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_unbacked_symint_as_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_grid_sampler_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_grid_sampler_expand_preserves_view_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_hardtanh_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_horizonal_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_horizonal_fusion2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_as_masked_fill_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_select_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_indirect_load_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inductor_multiple_specializations_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_isin_tensor_scalar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_l1_loss_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_grid_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_lerp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_rands_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear_dynamic_maxautotune_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logaddexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logsumexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mark_unbacked_with_hint_override_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_fill_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_matmul_layer_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_min_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mixed_mm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mul_index_expr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_gpu_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_prime_size_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_var_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_var_lowp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_new_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_mega_fusion_during_lowering_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nonzero_unbacked_refinement_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_output_strides_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pattern_matcher_multi_user_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_permute2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_bessel_j1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_chebyshev_polynomial_t_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_digamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_erfcx_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_erfinv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_expit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_gammainc_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_k0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_psi_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_xlog1py_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow_by_natural_log2_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_profiler_mark_wrapper_call_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_kernel_count_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randn_generator_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randn_like_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reflection_pad2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_clone_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_slice_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scalar_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scaled_dot_product_attention_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_add1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_reduce1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_reduce2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_searchsorted_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_searchsorted_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sgn_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_shape_prop_torch_ones_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_silu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_simplify_loops_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_single_elem_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sizehint_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter_dtype_consistency_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_view_with_graph_break_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_special_polygamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumsum_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_reduction_dynamic_shape_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_integer_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum_keepdims_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tmp_not_defined_issue1_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tmp_not_defined_issue2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_to_device_constant_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_to_memory_format_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_transpose_add_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_bilinear2d_a_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_bilinear2d_b_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_cat_conv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_nearest3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_var_mean_div_by_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vertical_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_as_real_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_uint8_through_differing_bitwidths_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zero_dim_reductions_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zeros_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test__dyn_quant_matmul_4bit_bf16_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adding_tensor_offsets_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_addmv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_allow_reuse_disable_if_exceed_peak_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_angle_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_support_str_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin_with_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_min_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_to_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_alignment_op_name_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool3d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool_errors_with_uint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_baddbmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_batch_norm_2d_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bernoulli1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bitwise2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bitwise3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_add_autotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_nd_tiling_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_float_ndigits_neg_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_unbacked_empty_1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clamp_type_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_complex_from_real_imag_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_complex_memory_overlap_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_config_option_dont_assume_alignment_cudagraphs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv2d_backward_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv3d_channels_last_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cos_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_scalar_with_gpu_tensor_cpp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_tensor_with_cpu_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_tensor_with_gpu_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_fixed_layout_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_softmax_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout_deterministic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtype_sympy_expr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_embedding_bag_byte_unpack_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_erfc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expanded_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_basic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_flexible_layout_immutable_free_symbols_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_flip_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_forced_buffer_realize_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_boolean_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_sliced_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_transposed_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_truncation_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_functionalize_rng_wrappers_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fuse_tiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_constant_tensor1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_scalar_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_hardtanh_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_horizonal_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_nested_indirect_indexing_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_as_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_failed_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inductor_layout_optimization_input_mutations_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inner_reduction_detection_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_activations_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_isin_tensor_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_isinf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_layer_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_lite_regional_compile_repeated_blocks_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log1p_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logcumsumexp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logcumsumexp_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_long_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mark_unbacked_with_hint_override_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_min_max_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mix_device_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mm_views_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mul_softmax_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multi_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_any_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_sum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_var_lowp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mutations_loop_fusion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_to_num_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_neg_max_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_no_mega_fusion_during_lowering_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_norm_constant_overflow_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_one_hot_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_output_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_cast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_single_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_permute2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_airy_ai_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_erf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_erfcx_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_exp2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_gammainc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_sinc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_spherical_bessel_j0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_xlogy_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_prod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction_config_limit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reflection_pad2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reinterpret_dtypeview_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_view_default_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_replication_pad_errors_with_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_roi_align_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_round_correctness_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_add1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_setitem_with_int_parameter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_shape_padding_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_should_pad_bench_for_bmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sign_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sort_transpose_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_cumsum_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_reduction_dynamic_shape_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_sizes_with_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_strided_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_keepdims_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor_index_put_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_memory_format_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_topk_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_uint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unbind_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_bicubic2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_cat_conv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vectorized_ops_masked_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vertical_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_xblock_divides_xnumel_dynamic_shapes_cuda 2025-12-04T09:42:54.5133184Z 2025-12-04T09:42:54.5133584Z Finished inductor/test_torchinductor_codegen_dynamic_shapes 3/4 ... [2025-12-04 09:42:54.466648][1853.364216836], took 7.57min 2025-12-04T09:42:54.5134919Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-199c8a5f5951ffa6.xml 2025-12-04T09:42:54.5604181Z Running inductor/test_torchinductor_opinfo 6/14 ... [2025-12-04 09:42:54.559997][1853.457564541] 2025-12-04T09:42:54.5604704Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T09:42:54.5607402Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '--shard-id=6', '--num-shards=14', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:42:54.560349] 2025-12-04T09:54:56.4233256Z 2025-12-04T09:54:56.4234656Z inductor/test_torchinductor_opinfo 6/14 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_6.14_65e2e2b656ebd086_.log 2025-12-04T09:54:56.4353569Z Running 263 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bincount_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdist_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chalf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dist_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gradient_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_grid_sampler_2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hash_tensor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lerp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_solve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_solve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorsolve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vecdot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vecdot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logcumsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_zeros_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_glu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_huber_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_linear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_rrelu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_silu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pinverse_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_quantile_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_neg_3_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_scatter_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_int64 2025-12-04T09:54:56.4468595Z 2025-12-04T09:54:56.4468940Z Finished inductor/test_torchinductor_opinfo 6/14 ... [2025-12-04 09:54:56.423749][2575.321316878], took 12.03min 2025-12-04T09:54:56.4470243Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-97012a12d7277b61.xml 2025-12-04T09:54:56.5109161Z Running inductor/test_torchinductor_opinfo 14/14 ... [2025-12-04 09:54:56.510525][2575.408093688] 2025-12-04T09:54:56.5109757Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T09:54:56.5112455Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '--shard-id=14', '--num-shards=14', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:54:56.510873] 2025-12-04T10:05:37.7859050Z 2025-12-04T10:05:37.7860883Z inductor/test_torchinductor_opinfo 14/14 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_14.14_73c797de8513cfc8_.log 2025-12-04T10:05:37.7980172Z Running 268 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rpow___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__native_batch_norm_legit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acos_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcmul_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dist_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exponential_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flatten_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gcd_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geqrf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hash_tensor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hypot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_inv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_factor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_singular_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_triangular_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log1p_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_std_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matmul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_maximum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_linear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_silu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softshrink_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pow_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_nuttall_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_mean_unbiased_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_bool 2025-12-04T10:05:37.8106501Z 2025-12-04T10:05:37.8106848Z Finished inductor/test_torchinductor_opinfo 14/14 ... [2025-12-04 10:05:37.786045][3216.683611346], took 10.69min 2025-12-04T10:05:37.8108008Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-4ca7743e44bf290c.xml 2025-12-04T10:05:38.0320940Z Uploading artifacts took 0.16 seconds 2025-12-04T10:05:38.0325114Z Running inductor/test_compile_subprocess 1/2 ... [2025-12-04 10:05:38.032155][3216.929722585] 2025-12-04T10:05:38.0325625Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:05:38.0330023Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compile_subprocess.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:05:38.032601] 2025-12-04T10:13:51.0592867Z 2025-12-04T10:13:51.0596681Z PRINTING LOG FILE of inductor/test_compile_subprocess 1/2 (test/test-reports/inductor.test_compile_subprocess_1.2_f3ccfd39c19f4fed_.log) 2025-12-04T10:13:51.0598303Z Test results will be stored in test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-9265b0e9898bf9b1.xml 2025-12-04T10:13:51.0601295Z ============================= test session starts ============================== 2025-12-04T10:13:51.0602074Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:13:51.0602820Z cachedir: .pytest_cache 2025-12-04T10:13:51.0603682Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:13:51.0604468Z rootdir: /var/lib/jenkins/workspace 2025-12-04T10:13:51.0604884Z configfile: pytest.ini 2025-12-04T10:13:51.0605697Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:13:51.0606500Z collecting ... collected 897 items 2025-12-04T10:13:51.0606978Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T10:13:51.0808740Z Running 433 items in this shard: test/inductor/test_compile_subprocess.py::TestSubprocess::test_progressive, test/inductor/test_compile_subprocess.py::GPUTests::test_AllenaiLongformerBase_repro_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test__dyn_quant_matmul_4bit_bf16_input_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test__dyn_quant_matmul_4bit_fp32_input_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test__dyn_quant_pack_4bit_weight_fp32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test__unsafe_masked_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_abs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool1d_argmax_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool2d2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool_errors_with_long_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool_with_output_size_0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_max_pool2d1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_max_pool2d3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adaptive_pool_errors_with_long_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex10_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_complex8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_add_inplace_permuted_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_adding_tensor_offsets_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_addmm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_addmv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_alexnet_prefix_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_allow_reuse_disable_if_exceed_peak_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_angle_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_override_registration_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_support_out_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_with_scalar_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_arange4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_argmax_to_float_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_as_strided_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_assert_alignment_op_name_fail_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_assert_size_stride_op_name_pass_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d_backward3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d_backward4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_baddbmm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bernoulli1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bernoulli2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bitwise2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bitwise_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bmm1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bool_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_both_scalars_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_broadcast_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_computed_offsets_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int16_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_bucketize_nd_tiling_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_buffer_copied_in_graph_with_different_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_buffer_use_after_remove_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_float_ndigits_neg_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_float_ndigits_zero_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_int_ndigits_pos_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_int_ndigits_zero_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_inplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_negative_dim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_single_empty_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_unbacked_2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cat_unbacked_legacy_empty_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_chunk_recompiles_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_clamp_type_promotion_non_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_compar_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_complex_fallback_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_complex_from_real_imag_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_complex_memory_overlap_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_config_option_dont_assume_alignment_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_const_int32_to_float_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_1d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_2d_strides_nonpositive_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_fill_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv1d_with_permute_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv3d_channels_last_use_block_ptr_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv3d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_bn_fuse_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_inference_heuristics_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_shape_check_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_conv_with_as_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_convolution1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_convolution3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_convolution4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_copy_with_scalar_src_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cpu_tensor_with_gpu_tensor_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cumsum_inf_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_cumsum_pattern_matcher_issue_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_default_layout_constraint_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_fixed_layout_channels_last_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_fixed_layout_sequential_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_op_unbacked_symints_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_scan_op_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_custom_scan_op_multi_input_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_data_type_propogation_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dense_mask_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_deterministic_codegen_with_suffix_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_diagonal_copy_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div9_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_by_zero_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_precision_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_presicion_accuracy_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_prim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_div_softmax_symfloat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dont_constant_fold_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dropout_trivial_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtype_mismatch_issue_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtype_sympy_expr_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_fusion_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_float16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_bfloat16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_uint8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_elu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_embedding_bag_byte_unpack_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_embedding_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_empty_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_exact_stride_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_exp2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_exp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_expand_as_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_expand_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fft_real_input_real_output_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fill2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_flexible_layout_immutable_free_symbols_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_flip_cat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_flip_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_float_repr_dynamic_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_floordiv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_forced_buffer_realize_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_full_boolean_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_full_like_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_functionalize_rng_wrappers_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gather1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gather_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_gelu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_getitem_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_arange2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_misaligned_input_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_mutation_real_name_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_scalar_inputs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_hardtanh_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_horizonal_fusion1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_as_masked_fill_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_deterministic_fallback_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_failed_reinplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_put_reinplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_index_remainder_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_indirect_load_broadcast_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inf_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inner_reduction_detection_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_add_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_flip_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_inplace_resize_as_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_input_mutation2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_input_mutation5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_insignificant_strides_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_int8_weight_only_quant_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_isin_tensor_scalar_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_issue102546_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_kernel_names_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_kwargs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_l1_loss_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_block_sizes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_broadcast_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_offset_pointwise_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_large_tensor_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linalg_eig_stride_consistency_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linspace2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_linspace4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_list_clearing_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_lite_mode_not_decompose_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_lite_triton_kernel_wrapper_functional_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_log2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_log_fp64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_logaddexp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_logcumsumexp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_logcumsumexp_zero_dim_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_logsumexp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_1_dim_3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_masked_fill_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_masked_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d6_dilation_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mean_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_min_max_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_misaligned_address_issue1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mix_device_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mixed_mm3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mixed_mm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_move_arange_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mul_index_expr_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_mul_softmax_symfloat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multi_gpu_recompile_on_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multi_threading_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multilayer_prime_size_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_multilayer_var_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nan_sort_stable_True_descending_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_nan_sort_stable_True_descending_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_needs_contiguous_strides_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_neg_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_new_empty_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_new_empty_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_new_ones_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_no_mega_fusion_during_lowering_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_no_op_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_norm_constant_overflow_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_one_hot_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pad_single_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pattern_matcher_unbacked_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_permute1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_philox_rand_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_j0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_j1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_y0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_y1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_chebyshev_polynomial_t_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_chebyshev_polynomial_u_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_digamma_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_entr_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_erfc_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_erfinv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_exp2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_expm1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_gammainc_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_gammaln_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_hermite_polynomial_h_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_laguerre_polynomial_l_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_legendre_polynomial_p_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_log1p_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_i0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_i1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_k0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_k1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_ndtri_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_polygamma_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_round_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_scaled_modified_bessel_k0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_scaled_modified_bessel_k1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_sinc_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_spherical_bessel_j0_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_xlog1py_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_xlogy_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pointwise_zeta_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pow3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pow_int_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_pow_symfloat_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_prod_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_profiler_mark_wrapper_call_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_rand_like_deterministic_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randint_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_randn_generator_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reduction_config_limit_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reflection_pad2d_backward_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reflection_pad2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reinterpret_dtypeview_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_relu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_no_ops_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_view_default_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_view_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_as_strided_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_reuse_buffers_with_aliasing_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_rsqrt_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_rsqrt_dynamic_shapes_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_add1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_bf16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_reduce1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_scatter_reduce3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_searchsorted_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_select_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_setitem_with_int_parameter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sgn_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sgn_extremal_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_shape_padding_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_silu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_size_asserts_for_multi_output_fallback_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_mutation3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter_dtype_consistency_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_slice_view_with_graph_break_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_softmax_backward_data_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_softmax_one_kernel_loop_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sort_bool_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sort_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sort_stable_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sort_transpose_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_special_polygamma_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_cumprod_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_cumsum_low_prec_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_failed_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_with_list_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_split_with_sizes_with_unbacked_symints_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_std_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_stride_preservation_with_stride_modifying_fx_pass_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_strided_inputs_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum_dtype_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum_int_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_sum_keepdims_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tanh_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor1_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tensor_index_put_slice_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue1_use_block_ptr_True_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue3_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_to_device_constant_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_to_memory_format_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_topk_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_transpose_add_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_transposed_propagates_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_triton_argmin_argmax_transpose_logical_index_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_triton_kernel_bool_param_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_triu_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_uint4x2_mixed_mm_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unbacked_float_item_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unbacked_floordiv_simplify_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unbacked_floordiv_simplify_errors_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unroll_small_reduction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_float32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_float64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int16_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int32_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int64_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int8_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_unsqueeze_inplace_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_bilinear2d_a_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_cat_conv_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_upsample_nearest1d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_var_correction_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_var_mean_tile_reduction_False_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_vdd_clamp_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_view_as_complex_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_view_as_real_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_view_detach_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views2_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views4_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views5_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_views6_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_weight_norm_bwd_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_weight_norm_conv2d_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_where_broadcast_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_where_with_logical_op_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_zero_dim_reductions_cuda, test/inductor/test_compile_subprocess.py::GPUTests::test_zero_element_mutation_cuda 2025-12-04T10:13:51.1011887Z 2025-12-04T10:13:51.1012469Z inductor/test_compile_subprocess.py::TestSubprocess::test_progressive PASSED [13.0857s] [ 0%] 2025-12-04T10:13:51.1014020Z inductor/test_compile_subprocess.py::GPUTests::test_AllenaiLongformerBase_repro_cuda <- test/inductor/test_torchinductor.py PASSED [2.2221s] [ 0%] 2025-12-04T10:13:51.1016597Z inductor/test_compile_subprocess.py::GPUTests::test__dyn_quant_matmul_4bit_bf16_input_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0030s] (No _dyn_quant_matmul_4bit implementation on CUDA) [ 0%] 2025-12-04T10:13:51.1019069Z inductor/test_compile_subprocess.py::GPUTests::test__dyn_quant_matmul_4bit_fp32_input_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0026s] (No _dyn_quant_matmul_4bit implementation on CUDA) [ 0%] 2025-12-04T10:13:51.1021516Z inductor/test_compile_subprocess.py::GPUTests::test__dyn_quant_pack_4bit_weight_fp32_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0025s] (No _dyn_quant_pack_4bit_weight implementation on CUDA) [ 1%] 2025-12-04T10:13:51.1023693Z inductor/test_compile_subprocess.py::GPUTests::test__unsafe_masked_index_cuda <- test/inductor/test_torchinductor.py PASSED [0.9058s] [ 1%] 2025-12-04T10:13:51.1025462Z inductor/test_compile_subprocess.py::GPUTests::test_abs_cuda <- test/inductor/test_torchinductor.py PASSED [0.7946s] [ 1%] 2025-12-04T10:13:51.1027162Z inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool1d_argmax_cuda <- test/inductor/test_torchinductor.py PASSED [0.7258s] [ 1%] 2025-12-04T10:13:51.1028922Z inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool2d2_cuda <- test/inductor/test_torchinductor.py PASSED [0.1127s] [ 2%] 2025-12-04T10:13:51.1030716Z inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool_errors_with_long_cuda <- test/inductor/test_torchinductor.py PASSED [0.6475s] [ 2%] 2025-12-04T10:13:51.1032597Z inductor/test_compile_subprocess.py::GPUTests::test_adaptive_avg_pool_with_output_size_0_cuda <- test/inductor/test_torchinductor.py PASSED [0.4420s] [ 2%] 2025-12-04T10:13:51.1035035Z inductor/test_compile_subprocess.py::GPUTests::test_adaptive_max_pool2d1_cuda <- test/inductor/test_torchinductor.py W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1037224Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1039140Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1040926Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1042813Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1044739Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1046598Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1048234Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1049924Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1051863Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1053903Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1055959Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1057752Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1059575Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1061290Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1063221Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1064589Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1065754Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1066907Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1068079Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1069258Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1070480Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1071826Z W1204 10:06:06.975000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.1073467Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1074979Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1076885Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1078621Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1079741Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1081064Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1082964Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1084599Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1086328Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1087747Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1088936Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1090047Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1091138Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1092279Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1093423Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1094574Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1095759Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1096933Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1098093Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1099268Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1100468Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1101699Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1103266Z W1204 10:06:07.374000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.1104229Z PASSED [4.6353s] [ 2%] 2025-12-04T10:13:51.1105252Z inductor/test_compile_subprocess.py::GPUTests::test_adaptive_max_pool2d3_cuda <- test/inductor/test_torchinductor.py W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1106600Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1107779Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1108881Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1109947Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1111174Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1112435Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1113477Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1114571Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1115776Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1116963Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1118067Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1119145Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1120285Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1121436Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1122579Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1123714Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1125118Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1126280Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1127602Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1128796Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1130020Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1131386Z W1204 10:06:10.787000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.1132648Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1133528Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1134689Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1135785Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1136962Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1138176Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1139358Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1140404Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1141481Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1142686Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1143976Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1145084Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1146167Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1147319Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1148470Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1149634Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1150779Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1152035Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1153202Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1154388Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1155587Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1156819Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1158172Z W1204 10:06:11.177000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.1159124Z PASSED [4.0292s] [ 3%] 2025-12-04T10:13:51.1159797Z inductor/test_compile_subprocess.py::GPUTests::test_adaptive_pool_errors_with_long_cuda <- test/inductor/test_torchinductor.py PASSED [0.6065s] [ 3%] 2025-12-04T10:13:51.1160945Z inductor/test_compile_subprocess.py::GPUTests::test_add_complex10_cuda <- test/inductor/test_torchinductor.py PASSED [0.6032s] [ 3%] 2025-12-04T10:13:51.1161930Z inductor/test_compile_subprocess.py::GPUTests::test_add_complex3_cuda <- test/inductor/test_torchinductor.py PASSED [0.3271s] [ 3%] 2025-12-04T10:13:51.1162927Z inductor/test_compile_subprocess.py::GPUTests::test_add_complex8_cuda <- test/inductor/test_torchinductor.py PASSED [0.2706s] [ 3%] 2025-12-04T10:13:51.1163953Z inductor/test_compile_subprocess.py::GPUTests::test_add_inplace_permuted_cuda <- test/inductor/test_torchinductor.py PASSED [0.5926s] [ 4%] 2025-12-04T10:13:51.1165006Z inductor/test_compile_subprocess.py::GPUTests::test_adding_tensor_offsets_cuda <- test/inductor/test_torchinductor.py PASSED [0.2821s] [ 4%] 2025-12-04T10:13:51.1166210Z inductor/test_compile_subprocess.py::GPUTests::test_addmm_cuda <- test/inductor/test_torchinductor.py PASSED [0.6570s] [ 4%] 2025-12-04T10:13:51.1188733Z inductor/test_compile_subprocess.py::GPUTests::test_addmv_cuda <- test/inductor/test_torchinductor.py W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1190152Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1191318Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1192402Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1193464Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1194679Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1195847Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1196883Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1198218Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1199403Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1200573Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1201666Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1203207Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1204342Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1205508Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1206669Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1207910Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1209073Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1210231Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1211401Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1212581Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1213813Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1215126Z W1204 10:06:15.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.1216082Z PASSED [0.8291s] [ 4%] 2025-12-04T10:13:51.1217158Z inductor/test_compile_subprocess.py::GPUTests::test_alexnet_prefix_cuda <- test/inductor/test_torchinductor.py W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1218453Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1219624Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1220707Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1221771Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1223160Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1224668Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1225788Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1226872Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1228064Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1229257Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1230363Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1231445Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1232725Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1233866Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1235014Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1236150Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1237310Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1238478Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1239646Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1240838Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1242059Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1243405Z W1204 10:06:16.054000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.1244670Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1245545Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1246816Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1247904Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1248965Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1250188Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1251369Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1252411Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1253498Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1254694Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1255972Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1257077Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1258159Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1259304Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1260443Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1261583Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1262733Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1263963Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1265130Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1266303Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1267489Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1268722Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1270069Z W1204 10:06:16.731000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.1271416Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1272289Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1273441Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1274535Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1275594Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1276812Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1277978Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1279019Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1280213Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1281409Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1282596Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1283685Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1284762Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1285951Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1287082Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1288208Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1289351Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1290509Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1291667Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1292848Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1294023Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1295332Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1296688Z W1204 10:06:17.356000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.1297955Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1298826Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1299976Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1301071Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1302130Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1303418Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1304681Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1305714Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1306795Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1307986Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1309170Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1310263Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1311342Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1312484Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1313635Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1314774Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1315962Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1317125Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1318367Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1319544Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1320733Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1321951Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1323299Z W1204 10:06:17.659000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.1324245Z PASSED [2.0861s] [ 5%] 2025-12-04T10:13:51.1325165Z inductor/test_compile_subprocess.py::GPUTests::test_allow_reuse_disable_if_exceed_peak_cuda <- test/inductor/test_torchinductor.py PASSED [0.7152s] [ 5%] 2025-12-04T10:13:51.1326196Z inductor/test_compile_subprocess.py::GPUTests::test_angle_cuda <- test/inductor/test_torchinductor.py PASSED [0.8014s] [ 5%] 2025-12-04T10:13:51.1327488Z inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_override_registration_cuda <- test/inductor/test_torchinductor.py W1204 10:06:19.797000 60058 site-packages/torch/_export/__init__.py:71] +============================+ 2025-12-04T10:13:51.1328787Z W1204 10:06:19.797000 60058 site-packages/torch/_export/__init__.py:72] | !!! WARNING !!! | 2025-12-04T10:13:51.1329467Z W1204 10:06:19.797000 60058 site-packages/torch/_export/__init__.py:73] +============================+ 2025-12-04T10:13:51.1330798Z W1204 10:06:19.798000 60058 site-packages/torch/_export/__init__.py:74] torch._export.aot_compile()/torch._export.aot_load() is being deprecated, please switch to directly calling torch._inductor.aoti_compile_and_package(torch.export.export())/torch._inductor.aoti_load_package() instead. 2025-12-04T10:13:51.1331937Z PASSED [28.3986s] [ 5%] 2025-12-04T10:13:51.1332573Z inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_support_out_cuda <- test/inductor/test_torchinductor.py PASSED [5.7466s] [ 6%] 2025-12-04T10:13:51.1333625Z inductor/test_compile_subprocess.py::GPUTests::test_aoti_eager_with_scalar_cuda <- test/inductor/test_torchinductor.py PASSED [24.7990s] [ 6%] 2025-12-04T10:13:51.1335003Z inductor/test_compile_subprocess.py::GPUTests::test_arange4_cuda <- test/inductor/test_torchinductor.py W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1336260Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1337422Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1338514Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1339566Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1340770Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1341941Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1343029Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1344214Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1345402Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1346578Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1347674Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1348749Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1349896Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1351030Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1352158Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1353369Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1354530Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1355695Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1356856Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1358040Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1359264Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1360516Z W1204 10:07:18.438000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.1361674Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1362536Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1363688Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1364785Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1365842Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1367158Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1368322Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1369357Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1370444Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1371632Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1372814Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1373910Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1374989Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1376202Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1377328Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1378445Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1379576Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1380726Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1381868Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1383100Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1384273Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1385496Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1386737Z W1204 10:07:18.655000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.1387569Z PASSED [0.4424s] [ 6%] 2025-12-04T10:13:51.1388176Z inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin1_cuda <- test/inductor/test_torchinductor.py PASSED [0.7146s] [ 6%] 2025-12-04T10:13:51.1389177Z inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin2_cuda <- test/inductor/test_torchinductor.py PASSED [0.7951s] [ 6%] 2025-12-04T10:13:51.1390170Z inductor/test_compile_subprocess.py::GPUTests::test_argmax_argmin3_cuda <- test/inductor/test_torchinductor.py PASSED [1.4445s] [ 7%] 2025-12-04T10:13:51.1391624Z inductor/test_compile_subprocess.py::GPUTests::test_argmax_to_float_cuda <- test/inductor/test_torchinductor.py W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1392915Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1394065Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1395159Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1396216Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1397435Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1398609Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1399655Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1400807Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1401996Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1403182Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1404265Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1405331Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1406457Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1407579Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1408699Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1409823Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1410977Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1412124Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1413289Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1414479Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1415789Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1417097Z W1204 10:07:21.833000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.1418015Z PASSED [0.6847s] [ 7%] 2025-12-04T10:13:51.1418644Z inductor/test_compile_subprocess.py::GPUTests::test_as_strided_scatter_cuda <- test/inductor/test_torchinductor.py PASSED [0.8096s] [ 7%] 2025-12-04T10:13:51.1419721Z inductor/test_compile_subprocess.py::GPUTests::test_assert_alignment_op_name_fail_cuda <- test/inductor/test_torchinductor.py PASSED [0.0030s] [ 7%] 2025-12-04T10:13:51.1420816Z inductor/test_compile_subprocess.py::GPUTests::test_assert_size_stride_op_name_pass_cuda <- test/inductor/test_torchinductor.py PASSED [0.0026s] [ 8%] 2025-12-04T10:13:51.1421865Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d1_cuda <- test/inductor/test_torchinductor.py PASSED [0.7851s] [ 8%] 2025-12-04T10:13:51.1422845Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d4_cuda <- test/inductor/test_torchinductor.py PASSED [0.8704s] [ 8%] 2025-12-04T10:13:51.1423869Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d5_cuda <- test/inductor/test_torchinductor.py PASSED [1.1477s] [ 8%] 2025-12-04T10:13:51.1425190Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d6_cuda <- test/inductor/test_torchinductor.py PASSED [0.6726s] [ 9%] 2025-12-04T10:13:51.1426158Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d8_cuda <- test/inductor/test_torchinductor.py PASSED [1.5144s] [ 9%] 2025-12-04T10:13:51.1427172Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d_backward3_cuda <- test/inductor/test_torchinductor.py PASSED [1.4531s] [ 9%] 2025-12-04T10:13:51.1428228Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool2d_backward4_cuda <- test/inductor/test_torchinductor.py PASSED [0.1242s] [ 9%] 2025-12-04T10:13:51.1429487Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward2_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0003s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 9%] 2025-12-04T10:13:51.1430734Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward3_cuda <- test/inductor/test_torchinductor.py PASSED [2.3278s] [ 10%] 2025-12-04T10:13:51.1431780Z inductor/test_compile_subprocess.py::GPUTests::test_avg_pool3d_backward4_cuda <- test/inductor/test_torchinductor.py PASSED [0.1242s] [ 10%] 2025-12-04T10:13:51.1432777Z inductor/test_compile_subprocess.py::GPUTests::test_baddbmm_cuda <- test/inductor/test_torchinductor.py PASSED [0.7067s] [ 10%] 2025-12-04T10:13:51.1433744Z inductor/test_compile_subprocess.py::GPUTests::test_bernoulli1_cuda <- test/inductor/test_torchinductor.py PASSED [1.3433s] [ 10%] 2025-12-04T10:13:51.1435094Z inductor/test_compile_subprocess.py::GPUTests::test_bernoulli2_cuda <- test/inductor/test_torchinductor.py W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1436384Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1437537Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1438627Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1439693Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1441030Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1442218Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1443266Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1444357Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1445551Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1446739Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1447843Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1448927Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1450204Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1451344Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1452500Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1453645Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1454815Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1455988Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1457155Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1458353Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1459583Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1460879Z W1204 10:07:34.440000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.inductor_seeds.default 2025-12-04T10:13:51.1462076Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1463012Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1464289Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1465382Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1466441Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1467655Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1468838Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1470316Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1471905Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1473424Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1474609Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1475850Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1476948Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1478104Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1479252Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1480387Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1481534Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1482699Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1483861Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1485031Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1486215Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1487440Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1488728Z W1204 10:07:35.368000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.inductor_seeds.default 2025-12-04T10:13:51.1489607Z PASSED [1.1909s] [ 11%] 2025-12-04T10:13:51.1490288Z inductor/test_compile_subprocess.py::GPUTests::test_bitwise2_cuda <- test/inductor/test_torchinductor.py PASSED [0.3127s] [ 11%] 2025-12-04T10:13:51.1491250Z inductor/test_compile_subprocess.py::GPUTests::test_bitwise_cuda <- test/inductor/test_torchinductor.py PASSED [0.2688s] [ 11%] 2025-12-04T10:13:51.1492389Z inductor/test_compile_subprocess.py::GPUTests::test_bmm1_cuda <- test/inductor/test_torchinductor.py PASSED [0.5287s] [ 11%] 2025-12-04T10:13:51.1493321Z inductor/test_compile_subprocess.py::GPUTests::test_bool_cuda <- test/inductor/test_torchinductor.py PASSED [0.4062s] [ 12%] 2025-12-04T10:13:51.1494272Z inductor/test_compile_subprocess.py::GPUTests::test_both_scalars_cuda <- test/inductor/test_torchinductor.py PASSED [0.6810s] [ 12%] 2025-12-04T10:13:51.1495292Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_broadcast_cuda <- test/inductor/test_torchinductor.py PASSED [0.4527s] [ 12%] 2025-12-04T10:13:51.1496371Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_computed_offsets_cuda <- test/inductor/test_torchinductor.py PASSED [0.2041s] [ 12%] 2025-12-04T10:13:51.1497457Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int16_uint8_cuda <- test/inductor/test_torchinductor.py PASSED [1.3657s] [ 12%] 2025-12-04T10:13:51.1498517Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int16_cuda <- test/inductor/test_torchinductor.py PASSED [1.3567s] [ 13%] 2025-12-04T10:13:51.1499669Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int32_cuda <- test/inductor/test_torchinductor.py PASSED [1.3248s] [ 13%] 2025-12-04T10:13:51.1500731Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int32_int8_cuda <- test/inductor/test_torchinductor.py PASSED [1.3466s] [ 13%] 2025-12-04T10:13:51.1501802Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_int64_cuda <- test/inductor/test_torchinductor.py PASSED [1.3296s] [ 13%] 2025-12-04T10:13:51.1502870Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_int64_int8_cuda <- test/inductor/test_torchinductor.py PASSED [1.3575s] [ 14%] 2025-12-04T10:13:51.1504024Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_int16_cuda <- test/inductor/test_torchinductor.py PASSED [1.3415s] [ 14%] 2025-12-04T10:13:51.1505091Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_int_uint8_int8_cuda <- test/inductor/test_torchinductor.py PASSED [1.5759s] [ 14%] 2025-12-04T10:13:51.1506215Z inductor/test_compile_subprocess.py::GPUTests::test_bucketize_nd_tiling_True_cuda <- test/inductor/test_torchinductor.py PASSED [0.5045s] [ 14%] 2025-12-04T10:13:51.1507355Z inductor/test_compile_subprocess.py::GPUTests::test_buffer_copied_in_graph_with_different_shapes_cuda <- test/inductor/test_torchinductor.py PASSED [0.4481s] [ 15%] 2025-12-04T10:13:51.1508483Z inductor/test_compile_subprocess.py::GPUTests::test_buffer_use_after_remove_cuda <- test/inductor/test_torchinductor.py PASSED [2.4207s] [ 15%] 2025-12-04T10:13:51.1509576Z inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_float_ndigits_neg_cuda <- test/inductor/test_torchinductor.py PASSED [0.2755s] [ 15%] 2025-12-04T10:13:51.1510708Z inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_float_ndigits_zero_cuda <- test/inductor/test_torchinductor.py PASSED [0.2243s] [ 15%] 2025-12-04T10:13:51.1511821Z inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_int_ndigits_pos_cuda <- test/inductor/test_torchinductor.py PASSED [0.1723s] [ 15%] 2025-12-04T10:13:51.1512925Z inductor/test_compile_subprocess.py::GPUTests::test_builtins_round_int_ndigits_zero_cuda <- test/inductor/test_torchinductor.py PASSED [0.1721s] [ 16%] 2025-12-04T10:13:51.1513978Z inductor/test_compile_subprocess.py::GPUTests::test_cat_inplace_cuda <- test/inductor/test_torchinductor.py PASSED [0.8352s] [ 16%] 2025-12-04T10:13:51.1514973Z inductor/test_compile_subprocess.py::GPUTests::test_cat_negative_dim_cuda <- test/inductor/test_torchinductor.py PASSED [0.9873s] [ 16%] 2025-12-04T10:13:51.1516069Z inductor/test_compile_subprocess.py::GPUTests::test_cat_single_empty_cuda <- test/inductor/test_torchinductor.py PASSED [0.2081s] [ 16%] 2025-12-04T10:13:51.1517063Z inductor/test_compile_subprocess.py::GPUTests::test_cat_unbacked_2d_cuda <- test/inductor/test_torchinductor.py PASSED [0.8054s] [ 17%] 2025-12-04T10:13:51.1518100Z inductor/test_compile_subprocess.py::GPUTests::test_cat_unbacked_legacy_empty_cuda <- test/inductor/test_torchinductor.py PASSED [0.0241s] [ 17%] 2025-12-04T10:13:51.1519155Z inductor/test_compile_subprocess.py::GPUTests::test_chunk_recompiles_cuda <- test/inductor/test_torchinductor.py PASSED [0.9205s] [ 17%] 2025-12-04T10:13:51.1520604Z inductor/test_compile_subprocess.py::GPUTests::test_clamp_type_promotion_non_tensor_cuda <- test/inductor/test_torchinductor.py W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1521965Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1523124Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1524217Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1525709Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1526923Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1528110Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1529145Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1530220Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1531416Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1532603Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1533695Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1534780Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1535973Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1537120Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1538260Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1539391Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1540693Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1541856Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1543102Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1544294Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1545515Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1546834Z W1204 10:07:57.508000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.1547750Z PASSED [0.2263s] [ 17%] 2025-12-04T10:13:51.1548332Z inductor/test_compile_subprocess.py::GPUTests::test_compar_cuda <- test/inductor/test_torchinductor.py PASSED [0.2900s] [ 18%] 2025-12-04T10:13:51.1549823Z inductor/test_compile_subprocess.py::GPUTests::test_complex_fallback_cuda <- test/inductor/test_torchinductor.py W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1551127Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1552293Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1553388Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1554450Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1555664Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1556844Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1557878Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1558961Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1560147Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1561332Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1562434Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1563514Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1564743Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1565879Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1567027Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1568177Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1569342Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1570508Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1571674Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1572859Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1574244Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1575562Z W1204 10:07:58.192000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.1576468Z PASSED [0.3376s] [ 18%] 2025-12-04T10:13:51.1577567Z inductor/test_compile_subprocess.py::GPUTests::test_complex_from_real_imag_cuda <- test/inductor/test_torchinductor.py [W1204 10:07:58.327154612 EmptyTensor.cpp:57] Warning: ComplexHalf support is experimental and many operators don't support it yet. (function operator()) 2025-12-04T10:13:51.1578688Z PASSED [0.2297s] [ 18%] 2025-12-04T10:13:51.1579323Z inductor/test_compile_subprocess.py::GPUTests::test_complex_memory_overlap_cuda <- test/inductor/test_torchinductor.py PASSED [0.0030s] [ 18%] 2025-12-04T10:13:51.1580434Z inductor/test_compile_subprocess.py::GPUTests::test_config_option_dont_assume_alignment_cuda <- test/inductor/test_torchinductor.py PASSED [0.6156s] [ 18%] 2025-12-04T10:13:51.1581890Z inductor/test_compile_subprocess.py::GPUTests::test_const_int32_to_float_cuda <- test/inductor/test_torchinductor.py W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1583269Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1584429Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1585519Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1586578Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1587784Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1589061Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1590102Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1591187Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1592376Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1593561Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1594655Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1595737Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1596872Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1598093Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1599236Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1600379Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1601552Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1602704Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1603882Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1621213Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1622499Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1623892Z W1204 10:07:59.216000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.1625143Z PASSED [0.6446s] [ 19%] 2025-12-04T10:13:51.1625759Z inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_1d_cuda <- test/inductor/test_torchinductor.py PASSED [0.5889s] [ 19%] 2025-12-04T10:13:51.1626844Z inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_2d_strides_nonpositive_cuda <- test/inductor/test_torchinductor.py PASSED [0.3545s] [ 19%] 2025-12-04T10:13:51.1627946Z inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_fill_dtype_cuda <- test/inductor/test_torchinductor.py PASSED [0.2725s] [ 19%] 2025-12-04T10:13:51.1628982Z inductor/test_compile_subprocess.py::GPUTests::test_constant_pad_float64_cuda <- test/inductor/test_torchinductor.py PASSED [0.2708s] [ 20%] 2025-12-04T10:13:51.1630563Z inductor/test_compile_subprocess.py::GPUTests::test_conv1d_with_permute_cuda <- test/inductor/test_torchinductor.py W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1631878Z W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1633050Z W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1634140Z W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1635195Z W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1636419Z W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1637586Z W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1638730Z W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1639756Z W1204 10:08:01.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] AttributeError: Can't pickle local object 'CommonTemplate.test_conv1d_with_permute..ConvModel' 2025-12-04T10:13:51.1640541Z PASSED [0.2927s] [ 20%] 2025-12-04T10:13:51.1641380Z inductor/test_compile_subprocess.py::GPUTests::test_conv3d_channels_last_use_block_ptr_False_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (only support cpu conv3d channels_last) [ 20%] 2025-12-04T10:13:51.1642591Z inductor/test_compile_subprocess.py::GPUTests::test_conv3d_cuda <- test/inductor/test_torchinductor.py PASSED [0.5811s] [ 20%] 2025-12-04T10:13:51.1643566Z inductor/test_compile_subprocess.py::GPUTests::test_conv_backward_cuda <- test/inductor/test_torchinductor.py PASSED [0.3221s] [ 21%] 2025-12-04T10:13:51.1644754Z inductor/test_compile_subprocess.py::GPUTests::test_conv_bn_fuse_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 21%] 2025-12-04T10:13:51.1646043Z inductor/test_compile_subprocess.py::GPUTests::test_conv_inference_heuristics_cuda <- test/inductor/test_torchinductor.py PASSED [0.6306s] [ 21%] 2025-12-04T10:13:51.1647091Z inductor/test_compile_subprocess.py::GPUTests::test_conv_shape_check_cuda <- test/inductor/test_torchinductor.py PASSED [0.0686s] [ 21%] 2025-12-04T10:13:51.1648486Z inductor/test_compile_subprocess.py::GPUTests::test_conv_with_as_strided_cuda <- test/inductor/test_torchinductor.py W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1649794Z W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1650948Z W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1652046Z W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1653103Z W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1654398Z W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1655569Z W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1656602Z W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1657627Z W1204 10:08:03.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] AttributeError: Can't pickle local object 'CommonTemplate.test_conv_with_as_strided..Model' 2025-12-04T10:13:51.1658715Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1659585Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1660738Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1661834Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1662887Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1664268Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1665433Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1666526Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1667547Z W1204 10:08:03.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] AttributeError: Can't pickle local object 'CommonTemplate.test_conv_with_as_strided..Model' 2025-12-04T10:13:51.1668326Z PASSED [0.9719s] [ 21%] 2025-12-04T10:13:51.1668928Z inductor/test_compile_subprocess.py::GPUTests::test_convolution1_cuda <- test/inductor/test_torchinductor.py PASSED [0.7196s] [ 22%] 2025-12-04T10:13:51.1669934Z inductor/test_compile_subprocess.py::GPUTests::test_convolution3_cuda <- test/inductor/test_torchinductor.py PASSED [0.9145s] [ 22%] 2025-12-04T10:13:51.1670928Z inductor/test_compile_subprocess.py::GPUTests::test_convolution4_cuda <- test/inductor/test_torchinductor.py PASSED [0.8308s] [ 22%] 2025-12-04T10:13:51.1671934Z inductor/test_compile_subprocess.py::GPUTests::test_copy_with_scalar_src_cuda <- test/inductor/test_torchinductor.py PASSED [0.4642s] [ 22%] 2025-12-04T10:13:51.1673310Z inductor/test_compile_subprocess.py::GPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_cuda <- test/inductor/test_torchinductor.py W1204 10:08:07.225000 60274 site-packages/torch/_inductor/utils.py:2565] [0/0] DeviceCopy in input program 2025-12-04T10:13:51.1674292Z PASSED [0.2238s] [ 23%] 2025-12-04T10:13:51.1674935Z inductor/test_compile_subprocess.py::GPUTests::test_cpu_tensor_with_gpu_tensor_cuda <- test/inductor/test_torchinductor.py PASSED [0.0145s] [ 23%] 2025-12-04T10:13:51.1675965Z inductor/test_compile_subprocess.py::GPUTests::test_cumsum_inf_cuda <- test/inductor/test_torchinductor.py PASSED [0.5998s] [ 23%] 2025-12-04T10:13:51.1676992Z inductor/test_compile_subprocess.py::GPUTests::test_cumsum_pattern_matcher_issue_cuda <- test/inductor/test_torchinductor.py PASSED [0.3631s] [ 23%] 2025-12-04T10:13:51.1678472Z inductor/test_compile_subprocess.py::GPUTests::test_custom_op_1_cuda <- test/inductor/test_torchinductor.py W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1679748Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1680906Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1682001Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1683055Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1684265Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1685449Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1686485Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1687564Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1688855Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1690042Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1691150Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1692234Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1693370Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1694499Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1695655Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1696840Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1698001Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1699156Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1700335Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1701513Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1702811Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1704120Z W1204 10:08:08.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.test.foo.default 2025-12-04T10:13:51.1704954Z PASSED [0.3754s] [ 24%] 2025-12-04T10:13:51.1705913Z inductor/test_compile_subprocess.py::GPUTests::test_custom_op_2_cuda <- test/inductor/test_torchinductor.py W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1707183Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1708345Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1709448Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1710510Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1711724Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1712978Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1714016Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1715096Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1716285Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1717465Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1718559Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1719631Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1720768Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1721906Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1723038Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1724178Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1725668Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1726987Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1728161Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1729338Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1730560Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1731803Z W1204 10:08:08.756000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.test.foo2.default 2025-12-04T10:13:51.1732641Z PASSED [0.2257s] [ 24%] 2025-12-04T10:13:51.1733693Z inductor/test_compile_subprocess.py::GPUTests::test_custom_op_default_layout_constraint_cuda <- test/inductor/test_torchinductor.py W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1735063Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1736220Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1737418Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1738472Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1739678Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1740845Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1741877Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1743015Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 96, in reducer_override 2025-12-04T10:13:51.1744179Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _OpPickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1745325Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 418, in reduce_helper 2025-12-04T10:13:51.1746455Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] result = cls.pickle(op, pickler.options) 2025-12-04T10:13:51.1747563Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1748736Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1749915Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1751224Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1752469Z W1204 10:08:08.990000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.mylib.copy_.default 2025-12-04T10:13:51.1753312Z PASSED [0.1833s] [ 24%] 2025-12-04T10:13:51.1754357Z inductor/test_compile_subprocess.py::GPUTests::test_custom_op_fixed_layout_channels_last_cuda <- test/inductor/test_torchinductor.py W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1755716Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1756867Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1757959Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1759012Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1760217Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1761462Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1762493Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1763569Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1764755Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1765977Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1767074Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1768149Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1769288Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1770421Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1771554Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1772694Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1773850Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1775097Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1776265Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1777439Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1778663Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1779938Z W1204 10:08:09.207000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.inductor_seeds.default 2025-12-04T10:13:51.1780817Z PASSED [0.6630s] [ 24%] 2025-12-04T10:13:51.1781856Z inductor/test_compile_subprocess.py::GPUTests::test_custom_op_fixed_layout_sequential_cuda <- test/inductor/test_torchinductor.py W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1783316Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1784471Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1785670Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1786728Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1787937Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1789105Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1790139Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1791226Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1792417Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1793595Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1794697Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1795776Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1796920Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1798051Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1799266Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1800410Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1801577Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1802738Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1803897Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1805090Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1806315Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1807552Z W1204 10:08:09.842000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.test.bar.default 2025-12-04T10:13:51.1808778Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1809643Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1810787Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1811882Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1812937Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1814138Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1815310Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1816350Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1817424Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1818616Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1819789Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1820887Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1821963Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1823232Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1824639Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1825830Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1826969Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1828127Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1829288Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1830457Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1831644Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1832989Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1834223Z W1204 10:08:10.114000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.test.bar.default 2025-12-04T10:13:51.1835062Z PASSED [0.4357s] [ 24%] 2025-12-04T10:13:51.1836090Z inductor/test_compile_subprocess.py::GPUTests::test_custom_op_unbacked_symints_cuda <- test/inductor/test_torchinductor.py W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1837427Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1838573Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1839662Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1840714Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1841923Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1843085Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1844120Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1845198Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1846392Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1847692Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1848788Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1849857Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1850996Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1852122Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1853257Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1854381Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1855536Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1856768Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1857930Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1859108Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1860325Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1861620Z W1204 10:08:10.392000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.test_unbacked_symints.foo.default 2025-12-04T10:13:51.1862519Z PASSED [0.3000s] [ 25%] 2025-12-04T10:13:51.1863173Z inductor/test_compile_subprocess.py::GPUTests::test_custom_scan_op_cuda <- test/inductor/test_torchinductor.py PASSED [0.1971s] [ 25%] 2025-12-04T10:13:51.1864200Z inductor/test_compile_subprocess.py::GPUTests::test_custom_scan_op_multi_input_cuda <- test/inductor/test_torchinductor.py PASSED [0.1311s] [ 25%] 2025-12-04T10:13:51.1865357Z inductor/test_compile_subprocess.py::GPUTests::test_data_type_propogation_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (triton not supported) [ 25%] 2025-12-04T10:13:51.1866478Z inductor/test_compile_subprocess.py::GPUTests::test_dense_mask_index_cuda <- test/inductor/test_torchinductor.py PASSED [0.5136s] [ 26%] 2025-12-04T10:13:51.1867558Z inductor/test_compile_subprocess.py::GPUTests::test_deterministic_codegen_with_suffix_cuda <- test/inductor/test_torchinductor.py PASSED [0.7879s] [ 26%] 2025-12-04T10:13:51.1868624Z inductor/test_compile_subprocess.py::GPUTests::test_diagonal_copy_cuda <- test/inductor/test_torchinductor.py PASSED [0.9010s] [ 26%] 2025-12-04T10:13:51.1869578Z inductor/test_compile_subprocess.py::GPUTests::test_div2_cuda <- test/inductor/test_torchinductor.py PASSED [0.6212s] [ 26%] 2025-12-04T10:13:51.1870489Z inductor/test_compile_subprocess.py::GPUTests::test_div3_cuda <- test/inductor/test_torchinductor.py PASSED [0.3285s] [ 27%] 2025-12-04T10:13:51.1871400Z inductor/test_compile_subprocess.py::GPUTests::test_div4_cuda <- test/inductor/test_torchinductor.py PASSED [0.3163s] [ 27%] 2025-12-04T10:13:51.1872419Z inductor/test_compile_subprocess.py::GPUTests::test_div8_cuda <- test/inductor/test_torchinductor.py PASSED [0.6759s] [ 27%] 2025-12-04T10:13:51.1873333Z inductor/test_compile_subprocess.py::GPUTests::test_div9_cuda <- test/inductor/test_torchinductor.py PASSED [0.3696s] [ 27%] 2025-12-04T10:13:51.1874266Z inductor/test_compile_subprocess.py::GPUTests::test_div_by_zero_cuda <- test/inductor/test_torchinductor.py PASSED [0.5485s] [ 27%] 2025-12-04T10:13:51.1875627Z inductor/test_compile_subprocess.py::GPUTests::test_div_precision_cuda <- test/inductor/test_torchinductor.py W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1876913Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1878065Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1879163Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1880226Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1881594Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1882765Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1883796Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1884873Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1886109Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1887294Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1888379Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1889456Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1890596Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1891732Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1892871Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1894001Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1895167Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1896400Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1897572Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1898741Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1899968Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1901267Z W1204 10:08:16.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.1902485Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1903401Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1904541Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1905709Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1906766Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1907976Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1909140Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1910172Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1911247Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1912427Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1913609Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1914695Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1915772Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1916918Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1918057Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1919206Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1920416Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1921574Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1922729Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1923905Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1925407Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1926636Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1927949Z W1204 10:08:16.998000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.1928991Z PASSED [1.4223s] [ 28%] 2025-12-04T10:13:51.1929644Z inductor/test_compile_subprocess.py::GPUTests::test_div_presicion_accuracy_cuda <- test/inductor/test_torchinductor.py PASSED [0.3695s] [ 28%] 2025-12-04T10:13:51.1931035Z inductor/test_compile_subprocess.py::GPUTests::test_div_prim_cuda <- test/inductor/test_torchinductor.py W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1932307Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1933466Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1934562Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1935631Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1936832Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1938021Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1939062Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1940136Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1941322Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1942512Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1943662Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1944850Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1945983Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1947116Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1948251Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1949401Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1950568Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1951724Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1952889Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1954172Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1955399Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1956653Z W1204 10:08:17.775000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.div.default 2025-12-04T10:13:51.1957787Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1958649Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1959798Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1960885Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1961938Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1963134Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1964297Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1965335Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1966408Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1967587Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1968865Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1969961Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1971029Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1972172Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1973295Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.1974423Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.1975548Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.1976705Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.1977938Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.1979093Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.1980274Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.1981497Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.1982733Z W1204 10:08:17.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.div.default 2025-12-04T10:13:51.1983926Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.1984785Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.1985985Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.1987069Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.1988124Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.1989330Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.1990497Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.1991527Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.1992696Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.1993885Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.1995064Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.1996150Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.1997224Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.1998364Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.1999488Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2000622Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2001826Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2002980Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2004139Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2005301Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2006484Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2007706Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2008942Z W1204 10:08:18.075000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.div.default 2025-12-04T10:13:51.2009775Z PASSED [0.5147s] [ 28%] 2025-12-04T10:13:51.2011077Z inductor/test_compile_subprocess.py::GPUTests::test_div_softmax_symfloat_cuda <- test/inductor/test_torchinductor.py W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2013067Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] Traceback (most recent call last): 2025-12-04T10:13:51.2014807Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2016517Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] ).serialize() 2025-12-04T10:13:51.2018231Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2019952Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2021649Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2023228Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] pickler.dump(obj) 2025-12-04T10:13:51.2025031Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2026841Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2028549Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2030235Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] cls(obj, pickler.options), 2025-12-04T10:13:51.2031828Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2033733Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2035520Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2037384Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2039344Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2041381Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2043250Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2045078Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2047012Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2049045Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2051280Z W1204 10:08:18.447000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2053381Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2064756Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] Traceback (most recent call last): 2025-12-04T10:13:51.2066115Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2067222Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] ).serialize() 2025-12-04T10:13:51.2068294Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2069519Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2070711Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2071761Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] pickler.dump(obj) 2025-12-04T10:13:51.2072856Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2074062Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2075373Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2076479Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] cls(obj, pickler.options), 2025-12-04T10:13:51.2077576Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2078739Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2079882Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2081049Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2082195Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2083368Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2084545Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2085760Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2086960Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2088202Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2089529Z W1204 10:08:19.559000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2090538Z PASSED [2.2719s] [ 28%] 2025-12-04T10:13:51.2091272Z inductor/test_compile_subprocess.py::GPUTests::test_dont_constant_fold_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (triton not supported) [ 29%] 2025-12-04T10:13:51.2092413Z inductor/test_compile_subprocess.py::GPUTests::test_dropout_trivial_1_cuda <- test/inductor/test_torchinductor.py PASSED [0.2206s] [ 29%] 2025-12-04T10:13:51.2093847Z inductor/test_compile_subprocess.py::GPUTests::test_dtype_mismatch_issue_cuda <- test/inductor/test_torchinductor.py W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2095189Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2096412Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2097519Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2098592Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2099906Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2101088Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2102126Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2103272Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2104476Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2105669Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2106782Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2107875Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2109025Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2110169Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2111315Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2112460Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2113633Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2114885Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2116070Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2117268Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2118500Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2119827Z W1204 10:08:21.617000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2120742Z PASSED [1.5920s] [ 29%] 2025-12-04T10:13:51.2121365Z inductor/test_compile_subprocess.py::GPUTests::test_dtype_sympy_expr_cuda <- test/inductor/test_torchinductor.py PASSED [2.1440s] [ 29%] 2025-12-04T10:13:51.2122431Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_bfloat16_cuda <- test/inductor/test_torchinductor.py PASSED [0.6000s] [ 30%] 2025-12-04T10:13:51.2123533Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_int32_cuda <- test/inductor/test_torchinductor.py PASSED [0.8983s] [ 30%] 2025-12-04T10:13:51.2125098Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_bfloat16_int64_cuda <- test/inductor/test_torchinductor.py PASSED [0.6350s] [ 30%] 2025-12-04T10:13:51.2126190Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float16_bfloat16_cuda <- test/inductor/test_torchinductor.py PASSED [0.5882s] [ 30%] 2025-12-04T10:13:51.2127282Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_float16_cuda <- test/inductor/test_torchinductor.py PASSED [0.3114s] [ 30%] 2025-12-04T10:13:51.2128381Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_float64_cuda <- test/inductor/test_torchinductor.py PASSED [0.3084s] [ 31%] 2025-12-04T10:13:51.2129466Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_int64_cuda <- test/inductor/test_torchinductor.py PASSED [0.3006s] [ 31%] 2025-12-04T10:13:51.2130545Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float32_int8_cuda <- test/inductor/test_torchinductor.py PASSED [0.0042s] [ 31%] 2025-12-04T10:13:51.2131633Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_bfloat16_cuda <- test/inductor/test_torchinductor.py PASSED [0.0039s] [ 31%] 2025-12-04T10:13:51.2132723Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_float32_cuda <- test/inductor/test_torchinductor.py PASSED [0.2462s] [ 32%] 2025-12-04T10:13:51.2133827Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_float64_cuda <- test/inductor/test_torchinductor.py PASSED [0.2077s] [ 32%] 2025-12-04T10:13:51.2134909Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_int16_cuda <- test/inductor/test_torchinductor.py PASSED [0.0042s] [ 32%] 2025-12-04T10:13:51.2136035Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_float64_uint8_cuda <- test/inductor/test_torchinductor.py PASSED [0.0039s] [ 32%] 2025-12-04T10:13:51.2137083Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_fusion_cuda <- test/inductor/test_torchinductor.py PASSED [0.4356s] [ 33%] 2025-12-04T10:13:51.2138133Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_bfloat16_cuda <- test/inductor/test_torchinductor.py PASSED [0.5963s] [ 33%] 2025-12-04T10:13:51.2139207Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_float16_cuda <- test/inductor/test_torchinductor.py PASSED [0.5964s] [ 33%] 2025-12-04T10:13:51.2140416Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_float64_cuda <- test/inductor/test_torchinductor.py PASSED [0.6229s] [ 33%] 2025-12-04T10:13:51.2141473Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_int16_cuda <- test/inductor/test_torchinductor.py PASSED [0.5813s] [ 33%] 2025-12-04T10:13:51.2142529Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int16_int64_cuda <- test/inductor/test_torchinductor.py PASSED [0.6204s] [ 34%] 2025-12-04T10:13:51.2143662Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_float16_cuda <- test/inductor/test_torchinductor.py PASSED [0.3141s] [ 34%] 2025-12-04T10:13:51.2144730Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_float64_cuda <- test/inductor/test_torchinductor.py PASSED [0.3088s] [ 34%] 2025-12-04T10:13:51.2145778Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int32_int64_cuda <- test/inductor/test_torchinductor.py PASSED [0.3077s] [ 34%] 2025-12-04T10:13:51.2146841Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_float32_cuda <- test/inductor/test_torchinductor.py PASSED [0.2276s] [ 35%] 2025-12-04T10:13:51.2147903Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_float64_cuda <- test/inductor/test_torchinductor.py PASSED [0.2200s] [ 35%] 2025-12-04T10:13:51.2149217Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_int64_cuda <- test/inductor/test_torchinductor.py PASSED [0.2104s] [ 35%] 2025-12-04T10:13:51.2150968Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_int8_cuda <- test/inductor/test_torchinductor.py PASSED [0.0041s] [ 35%] 2025-12-04T10:13:51.2152039Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int64_uint8_cuda <- test/inductor/test_torchinductor.py PASSED [0.0039s] [ 36%] 2025-12-04T10:13:51.2153097Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_bfloat16_cuda <- test/inductor/test_torchinductor.py PASSED [0.0038s] [ 36%] 2025-12-04T10:13:51.2154168Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_float16_cuda <- test/inductor/test_torchinductor.py PASSED [0.0039s] [ 36%] 2025-12-04T10:13:51.2155237Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_float64_cuda <- test/inductor/test_torchinductor.py PASSED [0.0038s] [ 36%] 2025-12-04T10:13:51.2156288Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_int32_cuda <- test/inductor/test_torchinductor.py PASSED [0.0039s] [ 36%] 2025-12-04T10:13:51.2157337Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_int64_cuda <- test/inductor/test_torchinductor.py PASSED [0.0038s] [ 37%] 2025-12-04T10:13:51.2158379Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_int8_uint8_cuda <- test/inductor/test_torchinductor.py PASSED [0.0039s] [ 37%] 2025-12-04T10:13:51.2159436Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_bfloat16_cuda <- test/inductor/test_torchinductor.py PASSED [0.0038s] [ 37%] 2025-12-04T10:13:51.2160508Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int16_cuda <- test/inductor/test_torchinductor.py PASSED [0.0038s] [ 37%] 2025-12-04T10:13:51.2161561Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int32_cuda <- test/inductor/test_torchinductor.py PASSED [0.0038s] [ 38%] 2025-12-04T10:13:51.2162610Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_int8_cuda <- test/inductor/test_torchinductor.py PASSED [0.0043s] [ 38%] 2025-12-04T10:13:51.2163656Z inductor/test_compile_subprocess.py::GPUTests::test_dtypeview_uint8_uint8_cuda <- test/inductor/test_torchinductor.py PASSED [0.0038s] [ 38%] 2025-12-04T10:13:51.2165026Z inductor/test_compile_subprocess.py::GPUTests::test_elu_cuda <- test/inductor/test_torchinductor.py W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2166281Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2167560Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2168651Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2169715Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2170935Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2172113Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2173157Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2174234Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2175424Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2176723Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2177824Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2178912Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2180046Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2181186Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2182326Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2183532Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2184696Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2185856Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2187035Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2188231Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2189458Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2190852Z W1204 10:08:34.209000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2191763Z PASSED [0.8067s] [ 38%] 2025-12-04T10:13:51.2192578Z inductor/test_compile_subprocess.py::GPUTests::test_embedding_bag_byte_unpack_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (No cuda implementation (it returns empty)) [ 39%] 2025-12-04T10:13:51.2193767Z inductor/test_compile_subprocess.py::GPUTests::test_embedding_cuda <- test/inductor/test_torchinductor.py PASSED [0.5955s] [ 39%] 2025-12-04T10:13:51.2194752Z inductor/test_compile_subprocess.py::GPUTests::test_empty_strided_cuda <- test/inductor/test_torchinductor.py PASSED [0.1468s] [ 39%] 2025-12-04T10:13:51.2195740Z inductor/test_compile_subprocess.py::GPUTests::test_exact_stride_cuda <- test/inductor/test_torchinductor.py PASSED [0.4455s] [ 39%] 2025-12-04T10:13:51.2196697Z inductor/test_compile_subprocess.py::GPUTests::test_exp2_cuda <- test/inductor/test_torchinductor.py PASSED [0.5972s] [ 39%] 2025-12-04T10:13:51.2197621Z inductor/test_compile_subprocess.py::GPUTests::test_exp_cuda <- test/inductor/test_torchinductor.py PASSED [0.4667s] [ 40%] 2025-12-04T10:13:51.2198552Z inductor/test_compile_subprocess.py::GPUTests::test_expand_as_cuda <- test/inductor/test_torchinductor.py PASSED [0.5465s] [ 40%] 2025-12-04T10:13:51.2199509Z inductor/test_compile_subprocess.py::GPUTests::test_expand_cuda <- test/inductor/test_torchinductor.py PASSED [0.5916s] [ 40%] 2025-12-04T10:13:51.2200979Z inductor/test_compile_subprocess.py::GPUTests::test_fft_real_input_real_output_cuda <- test/inductor/test_torchinductor.py W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2202306Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2203478Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2204569Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2205631Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2206850Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2208028Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2209069Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2210151Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2211347Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2212538Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2213641Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2214715Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2215936Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2217078Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2218219Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2219367Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2220524Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2221693Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2222867Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2224102Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2225658Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2226975Z W1204 10:08:38.044000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2227888Z PASSED [0.1651s] [ 40%] 2025-12-04T10:13:51.2228465Z inductor/test_compile_subprocess.py::GPUTests::test_fill2_cuda <- test/inductor/test_torchinductor.py PASSED [0.5105s] [ 41%] 2025-12-04T10:13:51.2229535Z inductor/test_compile_subprocess.py::GPUTests::test_flexible_layout_immutable_free_symbols_cuda <- test/inductor/test_torchinductor.py PASSED [0.0034s] [ 41%] 2025-12-04T10:13:51.2230969Z inductor/test_compile_subprocess.py::GPUTests::test_flip_cat_cuda <- test/inductor/test_torchinductor.py W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2232247Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2233408Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2234511Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2235591Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2236836Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2238018Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2239058Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2240279Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2241472Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2242659Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2243767Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2244847Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2245993Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2247136Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2248277Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2249530Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2250693Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2251861Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2253028Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2254214Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2255445Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2256693Z W1204 10:08:38.713000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.rev.default 2025-12-04T10:13:51.2257850Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2258718Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2259409Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2259718Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2260372Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2260832Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2261556Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2261884Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2262535Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2263063Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2263701Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2264060Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2264685Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2265099Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2265856Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2266269Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2266895Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2267334Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2267946Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2268410Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2269038Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2269537Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2270184Z W1204 10:08:38.959000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.rev.default 2025-12-04T10:13:51.2270279Z PASSED [0.4726s] [ 41%] 2025-12-04T10:13:51.2271062Z inductor/test_compile_subprocess.py::GPUTests::test_flip_cuda <- test/inductor/test_torchinductor.py W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2271442Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2272128Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2272603Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2273252Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2273718Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2274333Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2274657Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2275325Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2275767Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2276417Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2276854Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2277480Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2277904Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2278524Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2278943Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2279567Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2280010Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2280629Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2281083Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2281720Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2282217Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2282873Z W1204 10:08:39.153000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.rev.default 2025-12-04T10:13:51.2283275Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2283726Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2284416Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2284725Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2285379Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2285839Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2286463Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2286786Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2287439Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2287958Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2288595Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2288963Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2289580Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2289998Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2290619Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2291033Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2291659Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2292098Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2292716Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2293172Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2293804Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2294372Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2295018Z W1204 10:08:39.357000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.rev.default 2025-12-04T10:13:51.2295115Z PASSED [0.3817s] [ 41%] 2025-12-04T10:13:51.2296029Z inductor/test_compile_subprocess.py::GPUTests::test_float_repr_dynamic_shapes_cuda <- test/inductor/test_torchinductor.py W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2296408Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2297092Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2297408Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2298054Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2298516Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2299219Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2299542Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2300202Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2300642Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2301287Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2301649Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2302267Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2302690Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2303359Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2303779Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2304403Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2304840Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2305462Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2306021Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2306658Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2307154Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2307812Z W1204 10:08:39.751000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2308213Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2308581Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] Traceback (most recent call last): 2025-12-04T10:13:51.2309271Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2309578Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] ).serialize() 2025-12-04T10:13:51.2310305Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2310763Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2311385Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2311707Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] pickler.dump(obj) 2025-12-04T10:13:51.2312354Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2312807Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2313446Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2313811Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2314432Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2314849Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2315473Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2315887Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2316590Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2317030Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2317654Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2318113Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2318749Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2319244Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2319900Z W1204 10:08:40.452000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2320310Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2320755Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2321447Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2321754Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2322410Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2322879Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2323490Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2323829Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2324761Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2325221Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2325911Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2326266Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2326913Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2327326Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2328106Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2328526Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2329220Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2329876Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2330867Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2331568Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2332565Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2333328Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2334691Z W1204 10:08:40.982000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2335250Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2335755Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] Traceback (most recent call last): 2025-12-04T10:13:51.2336786Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2337266Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] ).serialize() 2025-12-04T10:13:51.2338265Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2338958Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2339865Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2340368Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] pickler.dump(obj) 2025-12-04T10:13:51.2341425Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2342136Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2343250Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2343820Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2344963Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2345632Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2346633Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2347311Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2348307Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2349007Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2349981Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2350720Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2351881Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2352654Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2353903Z W1204 10:08:41.684000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2354061Z PASSED [2.4874s] [ 42%] 2025-12-04T10:13:51.2354811Z inductor/test_compile_subprocess.py::GPUTests::test_floordiv_cuda <- test/inductor/test_torchinductor.py PASSED [0.5618s] [ 42%] 2025-12-04T10:13:51.2356273Z inductor/test_compile_subprocess.py::GPUTests::test_forced_buffer_realize_cuda <- test/inductor/test_torchinductor.py W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2356917Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2358123Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2358657Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2359800Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2360585Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2361667Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2362228Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2363494Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2364226Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2365317Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2365949Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2367037Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2367761Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2368839Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2369550Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2370698Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2371418Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2372437Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2373164Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2374210Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2375030Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2376183Z W1204 10:08:42.573000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops._inductor_test.realize.default 2025-12-04T10:13:51.2376855Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2377460Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2378604Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2379107Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2380183Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2380949Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2382087Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2382631Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2383825Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2384507Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2385560Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2386131Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2387163Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2387862Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2389034Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2389689Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2390713Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2391408Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2392423Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2393143Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2394149Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2394965Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2396131Z W1204 10:08:42.709000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops._inductor_test.realize.default 2025-12-04T10:13:51.2396279Z PASSED [0.2758s] [ 42%] 2025-12-04T10:13:51.2397031Z inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d3_cuda <- test/inductor/test_torchinductor.py PASSED [0.4985s] [ 42%] 2025-12-04T10:13:51.2397829Z inductor/test_compile_subprocess.py::GPUTests::test_fractional_max_pool2d5_cuda <- test/inductor/test_torchinductor.py PASSED [0.8324s] [ 42%] 2025-12-04T10:13:51.2398553Z inductor/test_compile_subprocess.py::GPUTests::test_full_boolean_cuda <- test/inductor/test_torchinductor.py PASSED [0.4120s] [ 43%] 2025-12-04T10:13:51.2399258Z inductor/test_compile_subprocess.py::GPUTests::test_full_like_cuda <- test/inductor/test_torchinductor.py PASSED [0.3183s] [ 43%] 2025-12-04T10:13:51.2400146Z inductor/test_compile_subprocess.py::GPUTests::test_functionalize_rng_wrappers_cuda <- test/inductor/test_torchinductor.py PASSED [0.0461s] [ 43%] 2025-12-04T10:13:51.2400819Z inductor/test_compile_subprocess.py::GPUTests::test_gather1_cuda <- test/inductor/test_torchinductor.py PASSED [0.7777s] [ 43%] 2025-12-04T10:13:51.2401529Z inductor/test_compile_subprocess.py::GPUTests::test_gather_scatter_cuda <- test/inductor/test_torchinductor.py PASSED [0.7793s] [ 44%] 2025-12-04T10:13:51.2402787Z inductor/test_compile_subprocess.py::GPUTests::test_gelu_cuda <- test/inductor/test_torchinductor.py W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2403383Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2404529Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2404997Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2405992Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2406904Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2407945Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2408486Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2409598Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2410379Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2411521Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2412114Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2413122Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2413819Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2414808Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2415451Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2416443Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2417149Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2418961Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2419753Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2420817Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2421679Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2422884Z W1204 10:08:46.927000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2423122Z PASSED [0.7518s] [ 44%] 2025-12-04T10:13:51.2423894Z inductor/test_compile_subprocess.py::GPUTests::test_getitem_cuda <- test/inductor/test_torchinductor.py PASSED [0.0205s] [ 44%] 2025-12-04T10:13:51.2425443Z inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_arange2_cuda <- test/inductor/test_torchinductor.py W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2426358Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2427347Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2427706Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2428377Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2428842Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2429465Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2429799Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2430459Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2430905Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2431553Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2431915Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2432544Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2432968Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2433741Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2434170Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2434792Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2435252Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2435870Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2436336Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2436974Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2437471Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2438215Z W1204 10:08:47.307000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2438622Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2438997Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] Traceback (most recent call last): 2025-12-04T10:13:51.2439692Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2440004Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] ).serialize() 2025-12-04T10:13:51.2440665Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2441133Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2441751Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2442080Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] pickler.dump(obj) 2025-12-04T10:13:51.2442733Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2443184Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2443829Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2444198Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] cls(obj, pickler.options), 2025-12-04T10:13:51.2444904Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2445328Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2445949Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2446370Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2446998Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2447443Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2448069Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2448525Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2449240Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2449743Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2450404Z W1204 10:08:47.441000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2450503Z PASSED [0.2718s] [ 44%] 2025-12-04T10:13:51.2451035Z inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_misaligned_input_cuda <- test/inductor/test_torchinductor.py PASSED [1.1132s] [ 45%] 2025-12-04T10:13:51.2451945Z inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_mutation_real_name_cuda <- test/inductor/test_torchinductor.py W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2452323Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2453025Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2453341Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2453995Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2454467Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2455087Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2455429Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2456170Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2456614Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2457267Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2457631Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2458266Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2458689Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2459319Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2459737Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2460446Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2460898Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2461523Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2461984Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2462617Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2463224Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2463942Z W1204 10:08:48.778000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2464267Z W1204 10:08:48.800000 60058 site-packages/torch/_inductor/utils.py:2565] [0/0] DeviceCopy in input program 2025-12-04T10:13:51.2464601Z W1204 10:08:48.801000 60058 site-packages/torch/_inductor/utils.py:2565] [0/0] DeviceCopy in input program 2025-12-04T10:13:51.2464694Z PASSED [0.2998s] [ 45%] 2025-12-04T10:13:51.2465212Z inductor/test_compile_subprocess.py::GPUTests::test_graph_partition_scalar_inputs_cuda <- test/inductor/test_torchinductor.py PASSED [0.6195s] [ 45%] 2025-12-04T10:13:51.2466070Z inductor/test_compile_subprocess.py::GPUTests::test_hardtanh_cuda <- test/inductor/test_torchinductor.py W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2466446Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2467146Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2467565Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2468229Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2468691Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2469318Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2469643Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2470300Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2470749Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2471389Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2471832Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2472458Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2472884Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2473506Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2473923Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2474558Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2475000Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2475631Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2476091Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2476725Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2477224Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2477942Z W1204 10:08:49.879000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2478038Z PASSED [0.4672s] [ 45%] 2025-12-04T10:13:51.2478585Z inductor/test_compile_subprocess.py::GPUTests::test_horizonal_fusion1_cuda <- test/inductor/test_torchinductor.py PASSED [0.6038s] [ 45%] 2025-12-04T10:13:51.2479020Z inductor/test_compile_subprocess.py::GPUTests::test_index2_cuda <- test/inductor/test_torchinductor.py PASSED [0.8302s] [ 46%] 2025-12-04T10:13:51.2479454Z inductor/test_compile_subprocess.py::GPUTests::test_index_put1_cuda <- test/inductor/test_torchinductor.py PASSED [2.0052s] [ 46%] 2025-12-04T10:13:51.2479888Z inductor/test_compile_subprocess.py::GPUTests::test_index_put3_cuda <- test/inductor/test_torchinductor.py PASSED [0.9233s] [ 46%] 2025-12-04T10:13:51.2480322Z inductor/test_compile_subprocess.py::GPUTests::test_index_put4_cuda <- test/inductor/test_torchinductor.py PASSED [0.3803s] [ 46%] 2025-12-04T10:13:51.2480797Z inductor/test_compile_subprocess.py::GPUTests::test_index_put_as_masked_fill_cuda <- test/inductor/test_torchinductor.py PASSED [1.0167s] [ 47%] 2025-12-04T10:13:51.2481327Z inductor/test_compile_subprocess.py::GPUTests::test_index_put_deterministic_fallback_cuda <- test/inductor/test_torchinductor.py PASSED [0.2390s] [ 47%] 2025-12-04T10:13:51.2481816Z inductor/test_compile_subprocess.py::GPUTests::test_index_put_failed_reinplace_cuda <- test/inductor/test_torchinductor.py PASSED [0.4993s] [ 47%] 2025-12-04T10:13:51.2482263Z inductor/test_compile_subprocess.py::GPUTests::test_index_put_index_cuda <- test/inductor/test_torchinductor.py PASSED [0.7238s] [ 47%] 2025-12-04T10:13:51.2482814Z inductor/test_compile_subprocess.py::GPUTests::test_index_put_reinplace_cuda <- test/inductor/test_torchinductor.py PASSED [0.4479s] [ 48%] 2025-12-04T10:13:51.2483270Z inductor/test_compile_subprocess.py::GPUTests::test_index_remainder_cuda <- test/inductor/test_torchinductor.py PASSED [0.4612s] [ 48%] 2025-12-04T10:13:51.2483760Z inductor/test_compile_subprocess.py::GPUTests::test_indirect_load_broadcast_cuda <- test/inductor/test_torchinductor.py PASSED [1.6839s] [ 48%] 2025-12-04T10:13:51.2484170Z inductor/test_compile_subprocess.py::GPUTests::test_inf_cuda <- test/inductor/test_torchinductor.py PASSED [0.3728s] [ 48%] 2025-12-04T10:13:51.2484659Z inductor/test_compile_subprocess.py::GPUTests::test_inner_reduction_detection_cuda <- test/inductor/test_torchinductor.py PASSED [0.2847s] [ 48%] 2025-12-04T10:13:51.2485099Z inductor/test_compile_subprocess.py::GPUTests::test_inplace_add_cuda <- test/inductor/test_torchinductor.py PASSED [0.1801s] [ 49%] 2025-12-04T10:13:51.2485930Z inductor/test_compile_subprocess.py::GPUTests::test_inplace_flip_cuda <- test/inductor/test_torchinductor.py W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2486310Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2487005Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2487319Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2487972Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2488440Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2489060Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2489385Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2490208Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2490658Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2491306Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2491667Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2492292Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2492718Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2493341Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2493765Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2494465Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2494915Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2495537Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2495992Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2496627Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2497127Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2497783Z W1204 10:09:01.036000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.rev.default 2025-12-04T10:13:51.2498193Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2498561Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2499274Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2505239Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2505945Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2506407Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2507137Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2507462Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2508112Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2508564Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2509199Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2509559Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2510182Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2510605Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2511302Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2511716Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2512344Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2512780Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2513400Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2514031Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2515071Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2515808Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2516526Z W1204 10:09:01.561000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.rev.default 2025-12-04T10:13:51.2516629Z PASSED [1.3085s] [ 49%] 2025-12-04T10:13:51.2517104Z inductor/test_compile_subprocess.py::GPUTests::test_inplace_resize_as_cuda <- test/inductor/test_torchinductor.py PASSED [0.0492s] [ 49%] 2025-12-04T10:13:51.2517573Z inductor/test_compile_subprocess.py::GPUTests::test_input_mutation2_cuda <- test/inductor/test_torchinductor.py PASSED [0.2320s] [ 49%] 2025-12-04T10:13:51.2518027Z inductor/test_compile_subprocess.py::GPUTests::test_input_mutation5_cuda <- test/inductor/test_torchinductor.py PASSED [0.1704s] [ 50%] 2025-12-04T10:13:51.2518508Z inductor/test_compile_subprocess.py::GPUTests::test_insignificant_strides_cuda <- test/inductor/test_torchinductor.py PASSED [0.1684s] [ 50%] 2025-12-04T10:13:51.2519141Z inductor/test_compile_subprocess.py::GPUTests::test_int8_weight_only_quant_cuda <- test/inductor/test_torchinductor.py PASSED [0.3307s] [ 50%] 2025-12-04T10:13:51.2519604Z inductor/test_compile_subprocess.py::GPUTests::test_isin_tensor_scalar_cuda <- test/inductor/test_torchinductor.py PASSED [0.8823s] [ 50%] 2025-12-04T10:13:51.2520048Z inductor/test_compile_subprocess.py::GPUTests::test_issue102546_cuda <- test/inductor/test_torchinductor.py PASSED [0.3081s] [ 51%] 2025-12-04T10:13:51.2520490Z inductor/test_compile_subprocess.py::GPUTests::test_kernel_names_cuda <- test/inductor/test_torchinductor.py PASSED [0.1569s] [ 51%] 2025-12-04T10:13:51.2521035Z inductor/test_compile_subprocess.py::GPUTests::test_kwargs_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (histogramdd only supports cpu) [ 51%] 2025-12-04T10:13:51.2521464Z inductor/test_compile_subprocess.py::GPUTests::test_l1_loss_cuda <- test/inductor/test_torchinductor.py PASSED [0.2780s] [ 51%] 2025-12-04T10:13:51.2521921Z inductor/test_compile_subprocess.py::GPUTests::test_large_block_sizes_cuda <- test/inductor/test_torchinductor.py PASSED [5.3903s] [ 51%] 2025-12-04T10:13:51.2522413Z inductor/test_compile_subprocess.py::GPUTests::test_large_broadcast_reduction_cuda <- test/inductor/test_torchinductor.py PASSED [0.3216s] [ 52%] 2025-12-04T10:13:51.2522890Z inductor/test_compile_subprocess.py::GPUTests::test_large_offset_pointwise_cuda <- test/inductor/test_torchinductor.py PASSED [0.6499s] [ 52%] 2025-12-04T10:13:51.2523467Z inductor/test_compile_subprocess.py::GPUTests::test_large_tensor_reduction_cuda <- test/inductor/test_torchinductor.py PASSED [0.8224s] [ 52%] 2025-12-04T10:13:51.2523968Z inductor/test_compile_subprocess.py::GPUTests::test_linalg_eig_stride_consistency_cuda <- test/inductor/test_torchinductor.py PASSED [0.1362s] [ 52%] 2025-12-04T10:13:51.2525849Z inductor/test_compile_subprocess.py::GPUTests::test_linspace2_cuda <- test/inductor/test_torchinductor.py W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2526263Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2526961Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2527281Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2527933Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2528393Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2529019Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2529343Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2529997Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2530439Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2531082Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2531632Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2532256Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2532679Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2533301Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2533719Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2534340Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2534782Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2535392Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2535954Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2536586Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2537080Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2537797Z W1204 10:09:12.096000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2537890Z PASSED [0.2481s] [ 53%] 2025-12-04T10:13:51.2538469Z inductor/test_compile_subprocess.py::GPUTests::test_linspace4_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (requires multiple cuda devices) [ 53%] 2025-12-04T10:13:51.2539126Z inductor/test_compile_subprocess.py::GPUTests::test_list_clearing_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0004s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 53%] 2025-12-04T10:13:51.2539601Z inductor/test_compile_subprocess.py::GPUTests::test_lite_mode_not_decompose_cuda <- test/inductor/test_torchinductor.py PASSED [0.4041s] [ 53%] 2025-12-04T10:13:51.2540143Z inductor/test_compile_subprocess.py::GPUTests::test_lite_triton_kernel_wrapper_functional_cuda <- test/inductor/test_torchinductor.py PASSED [0.4741s] [ 54%] 2025-12-04T10:13:51.2540549Z inductor/test_compile_subprocess.py::GPUTests::test_log2_cuda <- test/inductor/test_torchinductor.py PASSED [0.4730s] [ 54%] 2025-12-04T10:13:51.2540973Z inductor/test_compile_subprocess.py::GPUTests::test_log_fp64_cuda <- test/inductor/test_torchinductor.py PASSED [0.3987s] [ 54%] 2025-12-04T10:13:51.2541507Z inductor/test_compile_subprocess.py::GPUTests::test_logaddexp_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (Not implemented for CUDA) [ 54%] 2025-12-04T10:13:51.2541947Z inductor/test_compile_subprocess.py::GPUTests::test_logcumsumexp_cuda <- test/inductor/test_torchinductor.py PASSED [4.6057s] [ 54%] 2025-12-04T10:13:51.2542422Z inductor/test_compile_subprocess.py::GPUTests::test_logcumsumexp_zero_dim_cuda <- test/inductor/test_torchinductor.py PASSED [0.3152s] [ 55%] 2025-12-04T10:13:51.2543410Z inductor/test_compile_subprocess.py::GPUTests::test_logsumexp_cuda <- test/inductor/test_torchinductor.py W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2543789Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2544480Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2544789Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2545441Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2545901Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2546517Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2546920Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2547572Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2548008Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2548648Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2549006Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2549621Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2550047Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2550661Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2551081Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2551695Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2552133Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2552755Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2553206Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2553971Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2554463Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2555179Z W1204 10:09:19.490000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2555272Z PASSED [1.0345s] [ 55%] 2025-12-04T10:13:51.2556152Z inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_1_dim_3_cuda <- test/inductor/test_torchinductor.py W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2556526Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2557215Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2557525Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2558172Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2558718Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2559326Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2559651Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2560306Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2560745Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2561392Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2561744Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2562363Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2562786Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2563401Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2563826Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2564442Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2564967Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2565579Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2566031Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2566670Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2567160Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2567924Z W1204 10:09:20.022000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2568328Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2568707Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2569496Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2569800Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2570459Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2570918Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2571535Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2571863Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2572516Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2572953Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2573595Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2573955Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2574573Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2574995Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2575612Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2576115Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2576733Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2577170Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2577792Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2578244Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2578883Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2579373Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2580134Z W1204 10:09:22.422000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2580304Z PASSED [4.8341s] [ 55%] 2025-12-04T10:13:51.2581184Z inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_2_cuda <- test/inductor/test_torchinductor.py W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2581561Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2582251Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2582553Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2583287Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2583743Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2584356Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2584680Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2585324Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2585823Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2586459Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2586814Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2587512Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2587927Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2588549Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2588966Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2589590Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2590031Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2590641Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2591098Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2591805Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2592302Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2593061Z W1204 10:09:24.837000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2593468Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2593832Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2594521Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2594830Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2595479Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2595937Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2596545Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2596874Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2597520Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2597959Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2598679Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2599032Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2599659Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2600070Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2600692Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2601109Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2601724Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2602248Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2602862Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2603322Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2603952Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2604450Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2605206Z W1204 10:09:25.599000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2605299Z PASSED [1.5222s] [ 55%] 2025-12-04T10:13:51.2606192Z inductor/test_compile_subprocess.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_3_cuda <- test/inductor/test_torchinductor.py W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2606561Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2607249Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2607557Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2608206Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2608668Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2609380Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2609709Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2610355Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2610803Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2611441Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2611794Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2612422Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2612836Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2613532Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2613943Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2614567Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2615009Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2615648Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2616133Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2616757Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2617251Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2618008Z W1204 10:09:26.371000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2618416Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2618785Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2619467Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2619776Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2620503Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2620971Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2621577Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2621900Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2622552Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2623061Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2623703Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2624053Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2625142Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2625560Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2626181Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2626597Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2627208Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2627658Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2628264Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2628715Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2629339Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2629824Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2630581Z W1204 10:09:28.836000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2630668Z PASSED [4.9686s] [ 56%] 2025-12-04T10:13:51.2631111Z inductor/test_compile_subprocess.py::GPUTests::test_masked_fill_cuda <- test/inductor/test_torchinductor.py PASSED [0.5910s] [ 56%] 2025-12-04T10:13:51.2631771Z inductor/test_compile_subprocess.py::GPUTests::test_masked_scatter_cuda <- test/inductor/test_torchinductor.py PASSED [0.5531s] [ 56%] 2025-12-04T10:13:51.2632580Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d1_cuda <- test/inductor/test_torchinductor.py W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2632944Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2633629Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2633935Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2634581Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2635041Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2635646Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2636074Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2636723Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2637158Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2637804Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2638156Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2638777Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2639191Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2639807Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2640229Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2640840Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2641277Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2641889Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2642341Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2643039Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2643529Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2644284Z W1204 10:09:32.471000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2644687Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2645053Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2645782Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2646087Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2646733Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2647262Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2647871Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2648194Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2648843Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2649277Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2649912Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2650265Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2650885Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2651307Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2651919Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2652337Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2652949Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2653385Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2654099Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2654550Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2655175Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2655664Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2656419Z W1204 10:09:33.275000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2656508Z PASSED [1.5823s] [ 56%] 2025-12-04T10:13:51.2657310Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d3_cuda <- test/inductor/test_torchinductor.py W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2657678Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2658435Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2658742Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2659390Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2659850Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2660457Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2660784Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2661429Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2661864Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2662506Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2662858Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2663657Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2664139Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2664870Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2665431Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2666163Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2666680Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2667403Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2667924Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2668676Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2669247Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2670152Z W1204 10:09:34.056000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2670690Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2671112Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2671929Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2672272Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2673045Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2673583Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2674309Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2674676Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2675446Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2675952Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2676711Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2677122Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2677933Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2678425Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2679160Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2679645Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2680374Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2680878Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2681611Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2682133Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2682955Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2683526Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2684433Z W1204 10:09:34.987000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2684522Z PASSED [2.2269s] [ 57%] 2025-12-04T10:13:51.2685487Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d4_cuda <- test/inductor/test_torchinductor.py W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2685960Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2686777Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2687124Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2687894Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2688423Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2689148Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2689516Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2690286Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2690872Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2691634Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2692036Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2692773Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2693256Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2693993Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2694474Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2695203Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2695747Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2696354Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2696800Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2697430Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2697915Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2698673Z W1204 10:09:36.285000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2699070Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2699431Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2700118Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2700418Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2701064Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2701534Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2702139Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2702536Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2703262Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2703697Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2704341Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2704691Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2705316Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2705777Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2706389Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2706884Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2707496Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2707943Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2708550Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2709006Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2709631Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2710119Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2710876Z W1204 10:09:37.218000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2710965Z PASSED [1.8327s] [ 57%] 2025-12-04T10:13:51.2711774Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d5_cuda <- test/inductor/test_torchinductor.py W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2712142Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2712831Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2713132Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2713855Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2714318Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2714927Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2715263Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2715906Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2716346Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2716986Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2717337Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2718032Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2718442Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2719066Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2719474Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2720088Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2720532Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2721137Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2721591Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2722214Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2722704Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2723455Z W1204 10:09:38.138000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2723851Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2724221Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2725333Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2725649Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2726290Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2726757Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2727362Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2727684Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2728332Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2728766Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2729511Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2729861Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2730479Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2730894Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2731506Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2731926Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2732537Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2732980Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2733590Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2734037Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2734668Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2735156Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2735990Z W1204 10:09:39.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims._low_memory_max_pool_with_offsets.default 2025-12-04T10:13:51.2736085Z PASSED [2.0354s] [ 57%] 2025-12-04T10:13:51.2736566Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d6_dilation_1_cuda <- test/inductor/test_torchinductor.py PASSED [2.5085s] [ 57%] 2025-12-04T10:13:51.2737075Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward2_cuda <- test/inductor/test_torchinductor.py PASSED [7.5414s] [ 57%] 2025-12-04T10:13:51.2737589Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward3_cuda <- test/inductor/test_torchinductor.py PASSED [3.4701s] [ 58%] 2025-12-04T10:13:51.2738104Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward4_cuda <- test/inductor/test_torchinductor.py PASSED [13.3843s] [ 58%] 2025-12-04T10:13:51.2738613Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward6_cuda <- test/inductor/test_torchinductor.py PASSED [0.2535s] [ 58%] 2025-12-04T10:13:51.2739119Z inductor/test_compile_subprocess.py::GPUTests::test_max_pool2d_with_indices_backward_cuda <- test/inductor/test_torchinductor.py PASSED [2.3973s] [ 58%] 2025-12-04T10:13:51.2739524Z inductor/test_compile_subprocess.py::GPUTests::test_mean_cuda <- test/inductor/test_torchinductor.py PASSED [0.6825s] [ 59%] 2025-12-04T10:13:51.2740078Z inductor/test_compile_subprocess.py::GPUTests::test_min_max_reduction_cuda <- test/inductor/test_torchinductor.py PASSED [1.0114s] [ 59%] 2025-12-04T10:13:51.2740568Z inductor/test_compile_subprocess.py::GPUTests::test_misaligned_address_issue1_cuda <- test/inductor/test_torchinductor.py PASSED [0.4157s] [ 59%] 2025-12-04T10:13:51.2741387Z inductor/test_compile_subprocess.py::GPUTests::test_mix_device_index_cuda <- test/inductor/test_torchinductor.py W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2741765Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2742445Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2742757Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2743468Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2743927Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2744542Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2744859Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2745516Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2745956Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2746597Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2747028Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2747645Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2748062Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2748678Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2749094Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2749714Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2750146Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2750758Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2751284Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2751909Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2752398Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2753061Z W1204 10:10:11.822000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2753166Z PASSED [0.2172s] [ 59%] 2025-12-04T10:13:51.2754214Z inductor/test_compile_subprocess.py::GPUTests::test_mixed_mm3_cuda <- test/inductor/test_torchinductor.py W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2754600Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2755281Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2755595Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2756239Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2756702Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2757315Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2757635Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2758382Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2758820Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2759463Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2759822Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2760445Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2760856Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2761478Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2761899Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2762513Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2763033Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2763644Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2764096Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2764728Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2765221Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2765940Z W1204 10:10:12.025000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2766340Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2766717Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2767397Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2767697Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2768351Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2768807Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2769495Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2769816Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2770465Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2770904Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2771540Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2771897Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2772517Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2772933Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2773630Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2774048Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2774663Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2775105Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2775721Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2776172Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2776800Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2777288Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2778007Z W1204 10:10:12.243000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2778095Z PASSED [0.4317s] [ 60%] 2025-12-04T10:13:51.2778886Z inductor/test_compile_subprocess.py::GPUTests::test_mixed_mm_cuda <- test/inductor/test_torchinductor.py W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2779262Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2779949Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2780364Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2781012Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2781468Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2782087Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2782406Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2783138Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2783575Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2784217Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2784648Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2785267Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2785685Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2786303Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2786717Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2787335Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2787777Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2788387Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2788837Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2789468Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2789959Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2790679Z W1204 10:10:12.457000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2791080Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2791526Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2792208Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2792511Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2793165Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2793619Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2794238Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2794557Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2795200Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2796322Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2796954Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2797311Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2797932Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2798349Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2798973Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2799381Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2800000Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2800445Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2801060Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2801512Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2802141Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2802630Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2803417Z W1204 10:10:12.615000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2803514Z PASSED [0.3165s] [ 60%] 2025-12-04T10:13:51.2804320Z inductor/test_compile_subprocess.py::GPUTests::test_move_arange_cuda <- test/inductor/test_torchinductor.py W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2804702Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2805382Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2805689Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2806342Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2806796Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2807485Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2807808Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2808471Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2808908Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2809543Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2809906Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2810523Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2810940Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2811556Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2811970Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2812590Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2813029Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2813646Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2814259Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2814890Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2815382Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2816033Z W1204 10:10:12.774000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2816121Z PASSED [0.1581s] [ 60%] 2025-12-04T10:13:51.2816947Z inductor/test_compile_subprocess.py::GPUTests::test_mul_index_expr_cuda <- test/inductor/test_torchinductor.py W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2817315Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2817998Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2818385Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2819029Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2819494Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2820107Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2820430Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2821085Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2821520Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2822163Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2822518Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2823211Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2823633Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2824251Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2825014Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2825824Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2826274Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2826888Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2827343Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2827974Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2828470Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2829132Z W1204 10:10:12.934000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2829222Z PASSED [0.2856s] [ 60%] 2025-12-04T10:13:51.2830177Z inductor/test_compile_subprocess.py::GPUTests::test_mul_softmax_symfloat_cuda <- test/inductor/test_torchinductor.py W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2830550Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] Traceback (most recent call last): 2025-12-04T10:13:51.2831240Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2831557Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] ).serialize() 2025-12-04T10:13:51.2832206Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2832675Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2833282Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2833616Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] pickler.dump(obj) 2025-12-04T10:13:51.2834271Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2834710Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2835359Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2835754Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] cls(obj, pickler.options), 2025-12-04T10:13:51.2836384Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2836873Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2837488Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2837912Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2838525Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2838972Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2839591Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2840046Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2840670Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2841239Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2841955Z W1204 10:10:13.362000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2842359Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2842730Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] Traceback (most recent call last): 2025-12-04T10:13:51.2843412Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2843733Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] ).serialize() 2025-12-04T10:13:51.2844374Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2844832Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2845449Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2845773Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] pickler.dump(obj) 2025-12-04T10:13:51.2846429Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2846866Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2847591Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2847942Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] cls(obj, pickler.options), 2025-12-04T10:13:51.2848558Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2848977Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2849591Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2850008Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2850624Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2851057Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2851755Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2852202Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2852836Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2853328Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2854040Z W1204 10:10:14.435000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2854136Z PASSED [2.2287s] [ 60%] 2025-12-04T10:13:51.2854753Z inductor/test_compile_subprocess.py::GPUTests::test_multi_gpu_recompile_on_index_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (requires multiple cuda devices) [ 61%] 2025-12-04T10:13:51.2855212Z inductor/test_compile_subprocess.py::GPUTests::test_multi_threading_cuda <- test/inductor/test_torchinductor.py PASSED [0.1960s] [ 61%] 2025-12-04T10:13:51.2855683Z inductor/test_compile_subprocess.py::GPUTests::test_multilayer_prime_size_cuda <- test/inductor/test_torchinductor.py PASSED [0.5653s] [ 61%] 2025-12-04T10:13:51.2856139Z inductor/test_compile_subprocess.py::GPUTests::test_multilayer_var_cuda <- test/inductor/test_torchinductor.py PASSED [1.4421s] [ 61%] 2025-12-04T10:13:51.2856666Z inductor/test_compile_subprocess.py::GPUTests::test_nan_sort_stable_True_descending_False_cuda <- test/inductor/test_torchinductor.py PASSED [0.9291s] [ 62%] 2025-12-04T10:13:51.2857181Z inductor/test_compile_subprocess.py::GPUTests::test_nan_sort_stable_True_descending_True_cuda <- test/inductor/test_torchinductor.py PASSED [1.0221s] [ 62%] 2025-12-04T10:13:51.2858054Z inductor/test_compile_subprocess.py::GPUTests::test_needs_contiguous_strides_cuda <- test/inductor/test_torchinductor.py W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2858421Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2859194Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2859498Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2860148Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2860614Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2861220Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2861555Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2862196Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2862640Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2863440Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2863800Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2864422Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2864833Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2865455Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2865873Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2866492Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2866930Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2867545Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2867992Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2868616Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2869111Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2869837Z W1204 10:10:19.609000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.mylib.myop.default 2025-12-04T10:13:51.2869932Z PASSED [0.6721s] [ 62%] 2025-12-04T10:13:51.2870727Z inductor/test_compile_subprocess.py::GPUTests::test_neg_index_cuda <- test/inductor/test_torchinductor.py W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2871095Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2871778Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2872080Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2872735Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2873189Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2873799Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2874196Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2874841Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2875290Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2875924Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2876285Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2876901Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2877314Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2877930Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2878340Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2878959Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2879395Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2880006Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2880452Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2881152Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2881643Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2882295Z W1204 10:10:21.019000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2882706Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2883068Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2883756Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2884059Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2884698Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2885240Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2885842Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2886170Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2886810Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2887252Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2887888Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2888236Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2888858Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2889266Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2889888Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2890299Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2890930Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2897881Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2898684Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2899147Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2899779Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2900275Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2900928Z W1204 10:10:21.232000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2901336Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2901698Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2902374Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2902767Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2903491Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2903959Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2904564Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2904889Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2905535Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2905974Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2906623Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2906974Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2907598Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2908013Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2908632Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2909125Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2909741Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2910183Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2910798Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2911255Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2911880Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2912377Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2913022Z W1204 10:10:21.468000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2913192Z PASSED [1.9492s] [ 62%] 2025-12-04T10:13:51.2913637Z inductor/test_compile_subprocess.py::GPUTests::test_new_empty_cuda <- test/inductor/test_torchinductor.py PASSED [0.2079s] [ 63%] 2025-12-04T10:13:51.2914091Z inductor/test_compile_subprocess.py::GPUTests::test_new_empty_strided_cuda <- test/inductor/test_torchinductor.py PASSED [0.2060s] [ 63%] 2025-12-04T10:13:51.2914514Z inductor/test_compile_subprocess.py::GPUTests::test_new_ones_cuda <- test/inductor/test_torchinductor.py PASSED [0.2904s] [ 63%] 2025-12-04T10:13:51.2915015Z inductor/test_compile_subprocess.py::GPUTests::test_no_mega_fusion_during_lowering_cuda <- test/inductor/test_torchinductor.py PASSED [0.7299s] [ 63%] 2025-12-04T10:13:51.2915460Z inductor/test_compile_subprocess.py::GPUTests::test_no_op_reduction_cuda <- test/inductor/test_torchinductor.py PASSED [0.3582s] [ 63%] 2025-12-04T10:13:51.2916311Z inductor/test_compile_subprocess.py::GPUTests::test_norm_constant_overflow_cuda <- test/inductor/test_torchinductor.py W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2916681Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2917373Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2917687Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2918339Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2918795Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2919409Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2919733Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2920452Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2920896Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2921530Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2921886Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2922510Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2922921Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2923546Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2923956Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2924996Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2925453Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2926109Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2926567Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2927189Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2927688Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2928397Z W1204 10:10:24.355000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2928495Z PASSED [0.6226s] [ 64%] 2025-12-04T10:13:51.2929282Z inductor/test_compile_subprocess.py::GPUTests::test_one_hot_cuda <- test/inductor/test_torchinductor.py W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2929648Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2930334Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2930637Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2931288Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2931884Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2932501Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2932820Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2933468Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2933911Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2934549Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2934907Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2935525Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2936143Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2936764Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2937175Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2937804Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2938246Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2938862Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2939313Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2939933Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2940429Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2941074Z W1204 10:10:24.665000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.2941174Z PASSED [0.2226s] [ 64%] 2025-12-04T10:13:51.2941604Z inductor/test_compile_subprocess.py::GPUTests::test_pad_single_cuda <- test/inductor/test_torchinductor.py PASSED [0.3446s] [ 64%] 2025-12-04T10:13:51.2942466Z inductor/test_compile_subprocess.py::GPUTests::test_pattern_matcher_unbacked_cuda <- test/inductor/test_torchinductor.py W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2942911Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2943668Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2943974Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2944624Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2945086Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2945695Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2946013Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2946667Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2947183Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2947823Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2948171Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2948799Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2949210Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2949824Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2950245Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2950857Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2951302Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2951914Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2952369Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2952998Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2953488Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2954281Z W1204 10:10:25.293000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2954372Z PASSED [0.4813s] [ 64%] 2025-12-04T10:13:51.2954806Z inductor/test_compile_subprocess.py::GPUTests::test_permute1_cuda <- test/inductor/test_torchinductor.py PASSED [0.4116s] [ 65%] 2025-12-04T10:13:51.2955609Z inductor/test_compile_subprocess.py::GPUTests::test_philox_rand_cuda <- test/inductor/test_torchinductor.py W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2955985Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2956671Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2956976Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2957638Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2958173Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2958783Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2959103Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2959751Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2960193Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2960831Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2961198Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2961814Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2962236Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2962850Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2963261Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2963889Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2964327Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2965020Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2965498Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2966155Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2966651Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2967342Z W1204 10:10:26.143000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.rngprims.philox_rand.default 2025-12-04T10:13:51.2967758Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2968120Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] Traceback (most recent call last): 2025-12-04T10:13:51.2968807Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2969189Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] ).serialize() 2025-12-04T10:13:51.2969838Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2970293Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2970903Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2971229Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] pickler.dump(obj) 2025-12-04T10:13:51.2971867Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2972311Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2972948Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2973299Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] cls(obj, pickler.options), 2025-12-04T10:13:51.2973919Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2974337Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2974951Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2975366Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2976062Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2976509Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2977116Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2977567Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2978193Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2978684Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2979375Z W1204 10:10:26.819000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/1] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.rngprims.philox_rand.default 2025-12-04T10:13:51.2979466Z PASSED [1.3570s] [ 65%] 2025-12-04T10:13:51.2979932Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_j0_cuda <- test/inductor/test_torchinductor.py PASSED [0.8158s] [ 65%] 2025-12-04T10:13:51.2980505Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_j1_cuda <- test/inductor/test_torchinductor.py PASSED [0.3520s] [ 65%] 2025-12-04T10:13:51.2980963Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_y0_cuda <- test/inductor/test_torchinductor.py PASSED [0.3059s] [ 66%] 2025-12-04T10:13:51.2981429Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_bessel_y1_cuda <- test/inductor/test_torchinductor.py PASSED [0.9545s] [ 66%] 2025-12-04T10:13:51.2981952Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_chebyshev_polynomial_t_cuda <- test/inductor/test_torchinductor.py PASSED [0.4289s] [ 66%] 2025-12-04T10:13:51.2982470Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_chebyshev_polynomial_u_cuda <- test/inductor/test_torchinductor.py PASSED [0.1088s] [ 66%] 2025-12-04T10:13:51.2982983Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_digamma_cuda <- test/inductor/test_torchinductor.py PASSED [0.5696s] [ 66%] 2025-12-04T10:13:51.2983811Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_entr_cuda <- test/inductor/test_torchinductor.py W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.2984183Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.2984868Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.2985176Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.2985822Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.2986288Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.2986896Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.2987296Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.2987944Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.2988378Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.2989025Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.2989373Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.2989992Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.2990410Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.2991024Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.2991518Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.2992131Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.2992576Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.2993183Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.2993632Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.2994267Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.2994758Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.2995473Z W1204 10:10:31.466000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.2995563Z PASSED [0.6723s] [ 67%] 2025-12-04T10:13:51.2996014Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_erfc_cuda <- test/inductor/test_torchinductor.py PASSED [1.0054s] [ 67%] 2025-12-04T10:13:51.2996468Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_erfinv_cuda <- test/inductor/test_torchinductor.py PASSED [0.6153s] [ 67%] 2025-12-04T10:13:51.2996916Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_exp2_cuda <- test/inductor/test_torchinductor.py PASSED [0.3907s] [ 67%] 2025-12-04T10:13:51.2997366Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_expm1_cuda <- test/inductor/test_torchinductor.py PASSED [0.4082s] [ 68%] 2025-12-04T10:13:51.2997825Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_gammainc_cuda <- test/inductor/test_torchinductor.py PASSED [0.1206s] [ 68%] 2025-12-04T10:13:51.2998364Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_gammaln_cuda <- test/inductor/test_torchinductor.py PASSED [0.4993s] [ 68%] 2025-12-04T10:13:51.2998870Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_hermite_polynomial_h_cuda <- test/inductor/test_torchinductor.py PASSED [0.2674s] [ 68%] 2025-12-04T10:13:51.2999378Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_laguerre_polynomial_l_cuda <- test/inductor/test_torchinductor.py PASSED [0.1105s] [ 69%] 2025-12-04T10:13:51.2999900Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_legendre_polynomial_p_cuda <- test/inductor/test_torchinductor.py PASSED [0.2470s] [ 69%] 2025-12-04T10:13:51.3000346Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_log1p_cuda <- test/inductor/test_torchinductor.py PASSED [0.4104s] [ 69%] 2025-12-04T10:13:51.3000848Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_i0_cuda <- test/inductor/test_torchinductor.py PASSED [0.4343s] [ 69%] 2025-12-04T10:13:51.3001343Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_i1_cuda <- test/inductor/test_torchinductor.py PASSED [0.4434s] [ 69%] 2025-12-04T10:13:51.3001832Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_k0_cuda <- test/inductor/test_torchinductor.py PASSED [0.5541s] [ 70%] 2025-12-04T10:13:51.3002329Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_modified_bessel_k1_cuda <- test/inductor/test_torchinductor.py PASSED [0.1087s] [ 70%] 2025-12-04T10:13:51.3002850Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_ndtri_cuda <- test/inductor/test_torchinductor.py PASSED [0.3922s] [ 70%] 2025-12-04T10:13:51.3003324Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_polygamma_cuda <- test/inductor/test_torchinductor.py PASSED [0.2141s] [ 70%] 2025-12-04T10:13:51.3003768Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_round_cuda <- test/inductor/test_torchinductor.py PASSED [0.4102s] [ 71%] 2025-12-04T10:13:51.3004292Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_scaled_modified_bessel_k0_cuda <- test/inductor/test_torchinductor.py PASSED [0.4884s] [ 71%] 2025-12-04T10:13:51.3004815Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_scaled_modified_bessel_k1_cuda <- test/inductor/test_torchinductor.py PASSED [0.1092s] [ 71%] 2025-12-04T10:13:51.3005632Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_sinc_cuda <- test/inductor/test_torchinductor.py W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3006012Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3006693Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3007010Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3007652Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3008107Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3008725Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3009043Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3009770Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3010206Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3010846Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3011198Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3011816Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3012234Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3012846Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3013262Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3013960Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3014401Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3015014Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3015462Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3016092Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3016586Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3017306Z W1204 10:10:39.445000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3017399Z PASSED [0.7272s] [ 71%] 2025-12-04T10:13:51.3017905Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_spherical_bessel_j0_cuda <- test/inductor/test_torchinductor.py PASSED [0.1109s] [ 72%] 2025-12-04T10:13:51.3018738Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_xlog1py_cuda <- test/inductor/test_torchinductor.py W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3019102Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3019795Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3020094Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3020846Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3021302Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3021907Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3022239Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3022881Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3023383Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3024016Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3024642Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3025410Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3025820Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3026441Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3026857Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3027474Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3027913Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3028528Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3028976Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3029602Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3030094Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3030810Z W1204 10:10:40.045000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3030903Z PASSED [0.4748s] [ 72%] 2025-12-04T10:13:51.3031722Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_xlogy_cuda <- test/inductor/test_torchinductor.py W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3032197Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3032878Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3033183Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3033835Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3034287Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3034900Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3035218Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3035860Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3036380Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3037014Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3037368Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3037987Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3038401Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3039019Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3039431Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3040048Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3040486Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3041099Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3041549Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3042180Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3042670Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3043460Z W1204 10:10:40.521000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3043557Z PASSED [0.4728s] [ 72%] 2025-12-04T10:13:51.3044003Z inductor/test_compile_subprocess.py::GPUTests::test_pointwise_zeta_cuda <- test/inductor/test_torchinductor.py PASSED [1.4412s] [ 72%] 2025-12-04T10:13:51.3044418Z inductor/test_compile_subprocess.py::GPUTests::test_pow3_cuda <- test/inductor/test_torchinductor.py PASSED [0.2835s] [ 72%] 2025-12-04T10:13:51.3044831Z inductor/test_compile_subprocess.py::GPUTests::test_pow_int_cuda <- test/inductor/test_torchinductor.py PASSED [1.3640s] [ 73%] 2025-12-04T10:13:51.3045635Z inductor/test_compile_subprocess.py::GPUTests::test_pow_symfloat_cuda <- test/inductor/test_torchinductor.py W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3046008Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3046688Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3046997Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3047718Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3048176Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3048788Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3049107Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3049754Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3050195Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3050836Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3051190Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3051813Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3052221Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3052842Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3053262Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3053951Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3054395Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3055004Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3055457Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3056086Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3056574Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3057300Z W1204 10:10:43.863000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3057388Z PASSED [0.4672s] [ 73%] 2025-12-04T10:13:51.3057803Z inductor/test_compile_subprocess.py::GPUTests::test_prod_cuda <- test/inductor/test_torchinductor.py PASSED [1.9405s] [ 73%] 2025-12-04T10:13:51.3058371Z inductor/test_compile_subprocess.py::GPUTests::test_profiler_mark_wrapper_call_cuda <- test/inductor/test_torchinductor.py PASSED [0.1776s] [ 73%] 2025-12-04T10:13:51.3059222Z inductor/test_compile_subprocess.py::GPUTests::test_rand_like_deterministic_cuda <- test/inductor/test_torchinductor.py W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3059598Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3060280Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3060591Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3061237Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3061703Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3062318Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3062638Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3063368Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3063809Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3064447Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3064795Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3065516Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3065934Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3066546Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3066967Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3067581Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3068026Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3068637Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3069088Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3069794Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3070282Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3070979Z W1204 10:10:46.431000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.inductor_seeds.default 2025-12-04T10:13:51.3071066Z PASSED [0.4551s] [ 74%] 2025-12-04T10:13:51.3071865Z inductor/test_compile_subprocess.py::GPUTests::test_randint_cuda <- test/inductor/test_torchinductor.py W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3072234Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3072913Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3073217Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3073864Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3074324Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3074935Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3075257Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3075900Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3076414Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3077054Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3077407Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3078027Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3078437Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3079053Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3079466Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3080078Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3080599Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3081209Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3081665Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3082286Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3082772Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3083466Z W1204 10:10:46.903000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.inductor_seeds.default 2025-12-04T10:13:51.3083555Z PASSED [0.5137s] [ 74%] 2025-12-04T10:13:51.3084384Z inductor/test_compile_subprocess.py::GPUTests::test_randn_generator_cuda <- test/inductor/test_torchinductor.py W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3084747Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3085435Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3085740Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3086392Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3086850Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3087543Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3087865Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3088515Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3088953Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3089586Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3089946Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3090563Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3090980Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3091671Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3092082Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3092707Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3093142Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3093758Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3094210Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3094837Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3095329Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3096011Z W1204 10:10:47.399000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.inductor_seeds.default 2025-12-04T10:13:51.3096105Z PASSED [0.5339s] [ 74%] 2025-12-04T10:13:51.3096540Z inductor/test_compile_subprocess.py::GPUTests::test_reduction1_cuda <- test/inductor/test_torchinductor.py PASSED [0.4666s] [ 74%] 2025-12-04T10:13:51.3096981Z inductor/test_compile_subprocess.py::GPUTests::test_reduction3_cuda <- test/inductor/test_torchinductor.py PASSED [0.4553s] [ 75%] 2025-12-04T10:13:51.3097453Z inductor/test_compile_subprocess.py::GPUTests::test_reduction_config_limit_cuda <- test/inductor/test_torchinductor.py PASSED [0.0031s] [ 75%] 2025-12-04T10:13:51.3098404Z inductor/test_compile_subprocess.py::GPUTests::test_reflection_pad2d_backward_cuda <- test/inductor/test_torchinductor.py W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3098823Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3099638Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3099993Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3100763Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3101306Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3102032Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3102396Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3103272Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3103706Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3104351Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3104699Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3105318Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3105730Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3106343Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3106760Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3107378Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3107816Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3108424Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3108883Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3109503Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3110156Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3110811Z W1204 10:10:48.913000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3111209Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3111578Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3112257Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3112568Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3113209Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3113665Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3114356Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3114672Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3115324Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3115759Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3116391Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3116758Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3117372Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3117788Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3118406Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3118822Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3119435Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3119873Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3120485Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3121013Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3121641Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3122127Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3122778Z W1204 10:10:49.115000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3123177Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3123542Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3124223Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3124824Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3125627Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3126080Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3126695Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3127014Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3127655Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3128100Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3128735Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3129091Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3129711Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3130123Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3130746Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3131153Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3131778Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3132323Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3132941Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3133395Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3134020Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3134514Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3135164Z W1204 10:10:49.381000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3135566Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3135928Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3136693Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3136993Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3137641Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3138103Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3138707Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3139036Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3139677Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3140116Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3140753Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3141103Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3141730Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3142137Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3142753Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3143325Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3143944Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3144384Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3144991Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3145447Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3146072Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3146564Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3147345Z W1204 10:10:50.012000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3147742Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3148112Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3148792Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3149100Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3149741Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3150207Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3150812Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3151134Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3151779Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3152217Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3152860Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3153208Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3153904Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3154314Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3154925Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3155341Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3155951Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3156392Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3157003Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3157455Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3158079Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3158644Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3159293Z W1204 10:10:50.590000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3159694Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3160061Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3160739Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3161043Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3161690Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3162145Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3162760Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3163078Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3163727Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3164161Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3164867Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3165225Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3165843Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3166259Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3166872Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3167285Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3167901Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3168339Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3169030Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3169478Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3170104Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3170595Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3171249Z W1204 10:10:51.213000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3171649Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3172013Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3172695Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3172996Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3173641Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3174093Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3174700Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3175023Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3175742Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3176185Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3176819Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3177181Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3177794Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3178201Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3178825Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3179233Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3179928Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3180361Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3180971Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3181428Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3182049Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3182546Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3183257Z W1204 10:10:51.745000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3183660Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3184024Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3184706Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3185006Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3185656Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3186116Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3186823Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3187149Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3187790Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3188231Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3188868Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3189223Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3189843Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3190250Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3190944Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3191353Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3191972Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3192414Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3193022Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3193486Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3194110Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3194602Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3195247Z W1204 10:10:52.047000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3195644Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3196014Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3196689Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3196994Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3197713Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3198173Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3198776Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3199097Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3199741Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3200181Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3200821Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3201169Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3201860Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3202275Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3202890Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3203305Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3203915Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3204365Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3204971Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3205416Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3206045Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3206531Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3207183Z W1204 10:10:52.338000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3207577Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3207942Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3208698Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3209000Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3209644Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3210099Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3210712Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3211034Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3211679Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3212113Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3212821Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3213174Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3213791Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3214206Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3214818Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3215230Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3215845Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3216278Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3216893Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3217338Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3217972Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3218457Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3219179Z W1204 10:10:52.626000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3219583Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3219942Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3220619Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3220923Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3221568Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3222025Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3222626Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3223003Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3223721Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3224159Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3225561Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3225977Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3226593Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3227007Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3227625Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3228035Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3228653Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3229087Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3229700Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3230147Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3230935Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3231431Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3232076Z W1204 10:10:52.924000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3232482Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3232843Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3233519Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3233831Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3234471Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3234930Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3235644Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3235967Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3236612Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3237045Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3237680Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3238037Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3238653Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3239063Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3239679Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3240087Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3240708Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3241149Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3241833Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3242288Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3242908Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3243403Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3244047Z W1204 10:10:53.237000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3244443Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3244815Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3245489Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3245919Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3246561Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3247013Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3247625Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3247945Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3248590Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3249030Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3249665Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3250016Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3250628Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3251042Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3251657Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3252070Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3252759Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3253197Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3253804Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3254254Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3254879Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3255368Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3256017Z W1204 10:10:53.528000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3256413Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3256859Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3257535Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3257835Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3258485Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3258937Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3259543Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3259866Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3260508Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3260950Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3261582Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3261936Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3262555Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3263044Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3263737Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3264144Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3264759Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3265197Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3265809Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3266259Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3266884Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3267369Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3268117Z W1204 10:10:54.229000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3268209Z PASSED [5.6322s] [ 75%] 2025-12-04T10:13:51.3269037Z inductor/test_compile_subprocess.py::GPUTests::test_reflection_pad2d_cuda <- test/inductor/test_torchinductor.py W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3269412Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3270088Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3270400Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3271040Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3271494Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3272104Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3272421Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3273064Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3273502Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3274131Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3278815Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3279485Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3279912Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3280542Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3280966Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3281590Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3282028Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3282646Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3283174Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3283809Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3284311Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3284970Z W1204 10:10:54.523000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3285371Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3285742Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3286431Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3286738Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3287401Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3287865Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3288480Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3288811Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3289459Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3290057Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3290697Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3291057Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3291685Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3292102Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3292725Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3293142Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3293767Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3294282Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3294898Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3295353Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3295982Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3296472Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3297189Z W1204 10:10:54.968000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3297289Z PASSED [0.7402s] [ 75%] 2025-12-04T10:13:51.3297779Z inductor/test_compile_subprocess.py::GPUTests::test_reinterpret_dtypeview_cuda <- test/inductor/test_torchinductor.py PASSED [0.2483s] [ 75%] 2025-12-04T10:13:51.3298198Z inductor/test_compile_subprocess.py::GPUTests::test_relu_cuda <- test/inductor/test_torchinductor.py PASSED [0.3767s] [ 76%] 2025-12-04T10:13:51.3299017Z inductor/test_compile_subprocess.py::GPUTests::test_remove_no_ops_cuda <- test/inductor/test_torchinductor.py W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3299385Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3300082Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3300383Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3301115Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3301575Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3302195Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3302520Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3303243Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3303687Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3304336Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3304696Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3305315Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3305818Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3306448Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3306865Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3307487Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3307926Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3308547Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3308998Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3309627Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3310121Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3310830Z W1204 10:10:56.208000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3311248Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3311613Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3312376Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3312682Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3313327Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3313792Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3314400Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3314729Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3315380Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3315828Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3316567Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3316919Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3317543Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3317963Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3318592Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3319009Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3319634Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3320071Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3320684Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3321141Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3321764Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3322262Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3322974Z W1204 10:10:56.481000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3323462Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3323826Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3324838Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3325159Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3325804Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3326271Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3326880Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3327199Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3328112Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3328624Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3329393Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3329797Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3330538Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3331021Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3331755Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3332239Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3332977Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3333491Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3334222Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3334754Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3335500Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3336181Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3337040Z W1204 10:10:57.006000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3337508Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3337933Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3338750Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3339105Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3339887Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3340421Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3341230Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3341599Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3342380Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3342890Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3343600Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3343962Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3344583Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3345005Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3345625Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3346043Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3346668Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3347105Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3347719Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3348253Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3348882Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3349377Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3350095Z W1204 10:10:57.271000 60058 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3350186Z PASSED [1.9195s] [ 76%] 2025-12-04T10:13:51.3350717Z inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda <- test/inductor/test_torchinductor.py ('RERUN', {'yellow': True}) [0.3790s] [ 76%] 2025-12-04T10:13:51.3351250Z inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda <- test/inductor/test_torchinductor.py ('RERUN', {'yellow': True}) [0.2404s] [ 76%] 2025-12-04T10:13:51.3351700Z inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda <- test/inductor/test_torchinductor.py FAILED [0.2327s] [ 76%] 2025-12-04T10:13:51.3351787Z 2025-12-04T10:13:51.3351919Z ==================================== RERUNS ==================================== 2025-12-04T10:13:51.3352118Z _____________________ GPUTests.test_remove_noop_slice_cuda _____________________ 2025-12-04T10:13:51.3352229Z Traceback (most recent call last): 2025-12-04T10:13:51.3352603Z File "/var/lib/jenkins/workspace/test/inductor/test_torchinductor.py", line 14842, in new_test 2025-12-04T10:13:51.3352695Z return value(self) 2025-12-04T10:13:51.3353136Z File "/var/lib/jenkins/workspace/test/inductor/test_torchinductor.py", line 6708, in test_remove_noop_slice 2025-12-04T10:13:51.3353253Z self.assertExpectedInline( 2025-12-04T10:13:51.3353806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3272, in assertExpectedInline 2025-12-04T10:13:51.3354162Z return super().assertExpectedInline(actual if isinstance(actual, str) else str(actual), expect, skip + 1) 2025-12-04T10:13:51.3354562Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 413, in assertExpectedInline 2025-12-04T10:13:51.3354660Z assert_expected_inline( 2025-12-04T10:13:51.3355057Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 378, in assert_expected_inline 2025-12-04T10:13:51.3355184Z assert_eq(expect, actual, msg=help_text) 2025-12-04T10:13:51.3355621Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 450, in assertMultiLineEqualMaybeCppStack 2025-12-04T10:13:51.3355810Z self.assertMultiLineEqual(expect, actual, *args, **kwargs) 2025-12-04T10:13:51.3356127Z File "/opt/conda/envs/py_3.10/lib/python3.10/unittest/case.py", line 1226, in assertMultiLineEqual 2025-12-04T10:13:51.3356277Z self.fail(self._formatMessage(msg, standardMsg)) 2025-12-04T10:13:51.3356519Z File "/opt/conda/envs/py_3.10/lib/python3.10/unittest/case.py", line 675, in fail 2025-12-04T10:13:51.3356637Z raise self.failureException(msg) 2025-12-04T10:13:51.3356867Z AssertionError: 'def forward(self, arg0_1: "Sym(s77)", arg[333 chars]_9,)' != '' 2025-12-04T10:13:51.3357186Z - def forward(self, arg0_1: "Sym(s77)", arg1_1: "Sym(s27)", arg2_1: "Sym(s53)", arg3_1: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0"): 2025-12-04T10:13:51.3357451Z - add: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0" = torch.ops.aten.add.Tensor(arg3_1, 1); arg3_1 = None 2025-12-04T10:13:51.3357706Z - add_9: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0" = torch.ops.aten.add.Tensor(add, 1); add = None 2025-12-04T10:13:51.3358290Z - return (add_9,) : To accept the new output, re-run test with envvar EXPECTTEST_ACCEPT=1 (we recommend staging/committing your changes before doing this) 2025-12-04T10:13:51.3358296Z 2025-12-04T10:13:51.3358485Z To execute this test, run the following from the base repo dir: 2025-12-04T10:13:51.3359013Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 python test/inductor/test_compile_subprocess.py GPUTests.test_remove_noop_slice_cuda 2025-12-04T10:13:51.3359022Z 2025-12-04T10:13:51.3359250Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:13:51.3359439Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:13:51.3359537Z frames [('total', 1), ('ok', 1)] 2025-12-04T10:13:51.3359673Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:13:51.3359938Z aot_autograd [('total', 1), ('autograd_cache_miss', 1), ('autograd_cache_saved', 1), ('ok', 1)] 2025-12-04T10:13:51.3360370Z inductor [('triton_bundler_save_kernel', 8), ('fxgraph_cache_miss', 1), ('async_compile_cache_miss', 1), ('triton_bundler_save_static_autotuner', 1)] 2025-12-04T10:13:51.3360561Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:13:51.3361143Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__getstate__, please add missing op schema 2025-12-04T10:13:51.3361456Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3362033Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__setstate__, please add missing op schema 2025-12-04T10:13:51.3362266Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3362471Z _____________________ GPUTests.test_remove_noop_slice_cuda _____________________ 2025-12-04T10:13:51.3362582Z Traceback (most recent call last): 2025-12-04T10:13:51.3362911Z File "/var/lib/jenkins/workspace/test/inductor/test_torchinductor.py", line 14842, in new_test 2025-12-04T10:13:51.3363006Z return value(self) 2025-12-04T10:13:51.3363388Z File "/var/lib/jenkins/workspace/test/inductor/test_torchinductor.py", line 6708, in test_remove_noop_slice 2025-12-04T10:13:51.3363498Z self.assertExpectedInline( 2025-12-04T10:13:51.3363977Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3272, in assertExpectedInline 2025-12-04T10:13:51.3364329Z return super().assertExpectedInline(actual if isinstance(actual, str) else str(actual), expect, skip + 1) 2025-12-04T10:13:51.3364726Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 413, in assertExpectedInline 2025-12-04T10:13:51.3364827Z assert_expected_inline( 2025-12-04T10:13:51.3365224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 378, in assert_expected_inline 2025-12-04T10:13:51.3365350Z assert_eq(expect, actual, msg=help_text) 2025-12-04T10:13:51.3365789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 450, in assertMultiLineEqualMaybeCppStack 2025-12-04T10:13:51.3365975Z self.assertMultiLineEqual(expect, actual, *args, **kwargs) 2025-12-04T10:13:51.3366291Z File "/opt/conda/envs/py_3.10/lib/python3.10/unittest/case.py", line 1226, in assertMultiLineEqual 2025-12-04T10:13:51.3366442Z self.fail(self._formatMessage(msg, standardMsg)) 2025-12-04T10:13:51.3366683Z File "/opt/conda/envs/py_3.10/lib/python3.10/unittest/case.py", line 675, in fail 2025-12-04T10:13:51.3366797Z raise self.failureException(msg) 2025-12-04T10:13:51.3367029Z AssertionError: 'def forward(self, arg0_1: "Sym(s77)", arg[333 chars]_9,)' != '' 2025-12-04T10:13:51.3367349Z - def forward(self, arg0_1: "Sym(s77)", arg1_1: "Sym(s27)", arg2_1: "Sym(s53)", arg3_1: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0"): 2025-12-04T10:13:51.3367692Z - add: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0" = torch.ops.aten.add.Tensor(arg3_1, 1); arg3_1 = None 2025-12-04T10:13:51.3367944Z - add_9: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0" = torch.ops.aten.add.Tensor(add, 1); add = None 2025-12-04T10:13:51.3368418Z - return (add_9,) : To accept the new output, re-run test with envvar EXPECTTEST_ACCEPT=1 (we recommend staging/committing your changes before doing this) 2025-12-04T10:13:51.3368428Z 2025-12-04T10:13:51.3368614Z To execute this test, run the following from the base repo dir: 2025-12-04T10:13:51.3369141Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 python test/inductor/test_compile_subprocess.py GPUTests.test_remove_noop_slice_cuda 2025-12-04T10:13:51.3369146Z 2025-12-04T10:13:51.3369371Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:13:51.3369555Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:13:51.3369659Z frames [('total', 1), ('ok', 1)] 2025-12-04T10:13:51.3369797Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:13:51.3370060Z aot_autograd [('total', 1), ('autograd_cache_miss', 1), ('autograd_cache_saved', 1), ('ok', 1)] 2025-12-04T10:13:51.3370486Z inductor [('triton_bundler_save_kernel', 8), ('fxgraph_cache_miss', 1), ('async_compile_cache_miss', 1), ('triton_bundler_save_static_autotuner', 1)] 2025-12-04T10:13:51.3370752Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:13:51.3371332Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__getstate__, please add missing op schema 2025-12-04T10:13:51.3371564Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3372148Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__setstate__, please add missing op schema 2025-12-04T10:13:51.3372379Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3372557Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:13:51.3372663Z frames [('total', 1), ('ok', 1)] 2025-12-04T10:13:51.3372797Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:13:51.3373061Z aot_autograd [('total', 1), ('autograd_cache_miss', 1), ('autograd_cache_saved', 1), ('ok', 1)] 2025-12-04T10:13:51.3373489Z inductor [('triton_bundler_save_kernel', 8), ('fxgraph_cache_miss', 1), ('async_compile_cache_miss', 1), ('triton_bundler_save_static_autotuner', 1)] 2025-12-04T10:13:51.3373665Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:13:51.3374242Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__getstate__, please add missing op schema 2025-12-04T10:13:51.3374479Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3375049Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__setstate__, please add missing op schema 2025-12-04T10:13:51.3375280Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3375409Z =================================== FAILURES =================================== 2025-12-04T10:13:51.3375608Z _____________________ GPUTests.test_remove_noop_slice_cuda _____________________ 2025-12-04T10:13:51.3375716Z Traceback (most recent call last): 2025-12-04T10:13:51.3376048Z File "/var/lib/jenkins/workspace/test/inductor/test_torchinductor.py", line 14842, in new_test 2025-12-04T10:13:51.3376140Z return value(self) 2025-12-04T10:13:51.3376522Z File "/var/lib/jenkins/workspace/test/inductor/test_torchinductor.py", line 6708, in test_remove_noop_slice 2025-12-04T10:13:51.3376638Z self.assertExpectedInline( 2025-12-04T10:13:51.3377185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3272, in assertExpectedInline 2025-12-04T10:13:51.3377538Z return super().assertExpectedInline(actual if isinstance(actual, str) else str(actual), expect, skip + 1) 2025-12-04T10:13:51.3377940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 413, in assertExpectedInline 2025-12-04T10:13:51.3378041Z assert_expected_inline( 2025-12-04T10:13:51.3378440Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 378, in assert_expected_inline 2025-12-04T10:13:51.3378560Z assert_eq(expect, actual, msg=help_text) 2025-12-04T10:13:51.3378996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/expecttest/__init__.py", line 450, in assertMultiLineEqualMaybeCppStack 2025-12-04T10:13:51.3379187Z self.assertMultiLineEqual(expect, actual, *args, **kwargs) 2025-12-04T10:13:51.3379501Z File "/opt/conda/envs/py_3.10/lib/python3.10/unittest/case.py", line 1226, in assertMultiLineEqual 2025-12-04T10:13:51.3379647Z self.fail(self._formatMessage(msg, standardMsg)) 2025-12-04T10:13:51.3379891Z File "/opt/conda/envs/py_3.10/lib/python3.10/unittest/case.py", line 675, in fail 2025-12-04T10:13:51.3380003Z raise self.failureException(msg) 2025-12-04T10:13:51.3380232Z AssertionError: 'def forward(self, arg0_1: "Sym(s77)", arg[333 chars]_9,)' != '' 2025-12-04T10:13:51.3380630Z - def forward(self, arg0_1: "Sym(s77)", arg1_1: "Sym(s27)", arg2_1: "Sym(s53)", arg3_1: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0"): 2025-12-04T10:13:51.3380890Z - add: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0" = torch.ops.aten.add.Tensor(arg3_1, 1); arg3_1 = None 2025-12-04T10:13:51.3381140Z - add_9: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0" = torch.ops.aten.add.Tensor(add, 1); add = None 2025-12-04T10:13:51.3381620Z - return (add_9,) : To accept the new output, re-run test with envvar EXPECTTEST_ACCEPT=1 (we recommend staging/committing your changes before doing this) 2025-12-04T10:13:51.3381625Z 2025-12-04T10:13:51.3381815Z To execute this test, run the following from the base repo dir: 2025-12-04T10:13:51.3382336Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 python test/inductor/test_compile_subprocess.py GPUTests.test_remove_noop_slice_cuda 2025-12-04T10:13:51.3382346Z 2025-12-04T10:13:51.3382571Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:13:51.3382755Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:13:51.3382851Z frames [('total', 1), ('ok', 1)] 2025-12-04T10:13:51.3383046Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:13:51.3383302Z aot_autograd [('total', 1), ('autograd_cache_miss', 1), ('autograd_cache_saved', 1), ('ok', 1)] 2025-12-04T10:13:51.3383732Z inductor [('triton_bundler_save_kernel', 8), ('fxgraph_cache_miss', 1), ('async_compile_cache_miss', 1), ('triton_bundler_save_static_autotuner', 1)] 2025-12-04T10:13:51.3383914Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:13:51.3384487Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__getstate__, please add missing op schema 2025-12-04T10:13:51.3384729Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3385295Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__setstate__, please add missing op schema 2025-12-04T10:13:51.3385526Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3385706Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:13:51.3385801Z frames [('total', 1), ('ok', 1)] 2025-12-04T10:13:51.3385936Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:13:51.3386279Z aot_autograd [('total', 1), ('autograd_cache_miss', 1), ('autograd_cache_saved', 1), ('ok', 1)] 2025-12-04T10:13:51.3386708Z inductor [('triton_bundler_save_kernel', 8), ('fxgraph_cache_miss', 1), ('async_compile_cache_miss', 1), ('triton_bundler_save_static_autotuner', 1)] 2025-12-04T10:13:51.3386893Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:13:51.3387471Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__getstate__, please add missing op schema 2025-12-04T10:13:51.3387701Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3388274Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__setstate__, please add missing op schema 2025-12-04T10:13:51.3388504Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3388693Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:13:51.3388791Z frames [('total', 1), ('ok', 1)] 2025-12-04T10:13:51.3388926Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:13:51.3389183Z aot_autograd [('total', 1), ('autograd_cache_miss', 1), ('autograd_cache_saved', 1), ('ok', 1)] 2025-12-04T10:13:51.3389684Z inductor [('triton_bundler_save_kernel', 8), ('fxgraph_cache_miss', 1), ('async_compile_cache_miss', 1), ('triton_bundler_save_static_autotuner', 1)] 2025-12-04T10:13:51.3389869Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:13:51.3390439Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__getstate__, please add missing op schema 2025-12-04T10:13:51.3390671Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3391248Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/ops_handler.py:772: UserWarning: undefined OpHandler.__setstate__, please add missing op schema 2025-12-04T10:13:51.3391477Z warnings.warn(f"undefined OpHandler.{name}, please add missing op schema") 2025-12-04T10:13:51.3392129Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-9265b0e9898bf9b1.xml - 2025-12-04T10:13:51.3392286Z =========================== short test summary info ============================ 2025-12-04T10:13:51.3392854Z FAILED [0.2327s] inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda - AssertionError: 'def forward(self, arg0_1: "Sym(s77)", arg[333 chars]_9,)' != '' 2025-12-04T10:13:51.3393173Z - def forward(self, arg0_1: "Sym(s77)", arg1_1: "Sym(s27)", arg2_1: "Sym(s53)", arg3_1: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0"): 2025-12-04T10:13:51.3393436Z - add: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0" = torch.ops.aten.add.Tensor(arg3_1, 1); arg3_1 = None 2025-12-04T10:13:51.3393688Z - add_9: "f32[s77, s27, s53][s27*s53, s53, 1]cuda:0" = torch.ops.aten.add.Tensor(add, 1); add = None 2025-12-04T10:13:51.3394162Z - return (add_9,) : To accept the new output, re-run test with envvar EXPECTTEST_ACCEPT=1 (we recommend staging/committing your changes before doing this) 2025-12-04T10:13:51.3394167Z 2025-12-04T10:13:51.3394354Z To execute this test, run the following from the base repo dir: 2025-12-04T10:13:51.3394882Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 python test/inductor/test_compile_subprocess.py GPUTests.test_remove_noop_slice_cuda 2025-12-04T10:13:51.3394887Z 2025-12-04T10:13:51.3395109Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:13:51.3395266Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:13:51.3395453Z ======== 1 failed, 317 passed, 14 skipped, 2 rerun in 314.65s (0:05:14) ======== 2025-12-04T10:13:51.3395647Z Got exit code 1 2025-12-04T10:13:51.3395749Z Retrying single test... 2025-12-04T10:13:51.3396265Z Test results will be stored in test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-de1c2d4fca581e1f.xml 2025-12-04T10:13:51.3396414Z ============================= test session starts ============================== 2025-12-04T10:13:51.3396704Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:13:51.3396800Z cachedir: .pytest_cache 2025-12-04T10:13:51.3397226Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:13:51.3397336Z rootdir: /var/lib/jenkins/workspace 2025-12-04T10:13:51.3397431Z configfile: pytest.ini 2025-12-04T10:13:51.3397869Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:13:51.3398069Z collecting ... collected 897 items / 432 deselected / 465 selected 2025-12-04T10:13:51.3398548Z stepcurrent: skipping 331 already run items. Running only test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda 2025-12-04T10:13:51.3398650Z Running 1 items in this shard 2025-12-04T10:13:51.3398655Z 2025-12-04T10:13:51.3399112Z inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda <- test/inductor/test_torchinductor.py PASSED [9.1104s] [100%] 2025-12-04T10:13:51.3399196Z 2025-12-04T10:13:51.3399856Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-de1c2d4fca581e1f.xml - 2025-12-04T10:13:51.3400017Z ====================== 1 passed, 432 deselected in 9.18s ======================= 2025-12-04T10:13:51.3400111Z Got exit code 0 2025-12-04T10:13:51.3400318Z Test succeeded in new process, continuing with the rest of the tests 2025-12-04T10:13:51.3400837Z Test results will be stored in test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-e43b32b50674a33e.xml 2025-12-04T10:13:51.3400988Z ============================= test session starts ============================== 2025-12-04T10:13:51.3401271Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:13:51.3401380Z cachedir: .pytest_cache 2025-12-04T10:13:51.3401800Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:13:51.3401914Z rootdir: /var/lib/jenkins/workspace 2025-12-04T10:13:51.3402015Z configfile: pytest.ini 2025-12-04T10:13:51.3402445Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:13:51.3402646Z collecting ... collected 897 items / 332 deselected / 565 selected 2025-12-04T10:13:51.3402788Z stepcurrent: skipping 332 already run items. 2025-12-04T10:13:51.3402886Z Running 101 items in this shard 2025-12-04T10:13:51.3402891Z 2025-12-04T10:13:51.3404459Z inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_view_default_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0008s] (Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/151511 for platform(s) linux, rocm, slow. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests.) [ 0%] 2025-12-04T10:13:51.3406073Z inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_view_dtype_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0007s] (Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/151541 for platform(s) linux, rocm, slow. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests.) [ 1%] 2025-12-04T10:13:51.3406540Z inductor/test_compile_subprocess.py::GPUTests::test_repeat_as_strided_cuda <- test/inductor/test_torchinductor.py PASSED [9.0854s] [ 2%] 2025-12-04T10:13:51.3407382Z inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_2_cuda <- test/inductor/test_torchinductor.py W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3407757Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3408449Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3408750Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3409412Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3409877Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3410569Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3410890Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3411537Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3411985Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3412627Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3412989Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3413611Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3414025Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3414647Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3415063Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3415687Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3416129Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3416747Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3417277Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3417907Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3418399Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3419051Z W1204 10:11:40.727000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3419146Z PASSED [0.6480s] [ 3%] 2025-12-04T10:13:51.3420064Z inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_cuda <- test/inductor/test_torchinductor.py W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3420440Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3421122Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3421511Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3422156Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3422613Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3423293Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3423615Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3424270Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3424879Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3425522Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3425877Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3426494Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3426910Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3427532Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3427947Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3428684Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3429125Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3429738Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3430192Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3430827Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3431327Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3431976Z W1204 10:11:41.358000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3432064Z PASSED [0.5844s] [ 4%] 2025-12-04T10:13:51.3433118Z inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_cuda <- test/inductor/test_torchinductor.py W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3433485Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3434175Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3434479Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3435130Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3435590Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3436198Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3436525Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3437175Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3437618Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3438253Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3438612Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3439235Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3439727Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3440351Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3440764Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3441390Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3441826Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3442441Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3442894Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3443521Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3444094Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3444741Z W1204 10:11:41.938000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3444835Z PASSED [0.5452s] [ 5%] 2025-12-04T10:13:51.3445305Z inductor/test_compile_subprocess.py::GPUTests::test_repeat_interleave_cuda <- test/inductor/test_torchinductor.py PASSED [1.3251s] [ 6%] 2025-12-04T10:13:51.3445797Z inductor/test_compile_subprocess.py::GPUTests::test_reuse_buffers_with_aliasing_cuda <- test/inductor/test_torchinductor.py PASSED [5.2143s] [ 7%] 2025-12-04T10:13:51.3446216Z inductor/test_compile_subprocess.py::GPUTests::test_rsqrt_cuda <- test/inductor/test_torchinductor.py PASSED [0.8190s] [ 8%] 2025-12-04T10:13:51.3447061Z inductor/test_compile_subprocess.py::GPUTests::test_rsqrt_dynamic_shapes_cuda <- test/inductor/test_torchinductor.py W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3447433Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3448122Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3448425Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3449069Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3449532Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3450144Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3450544Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3451196Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3451637Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3452277Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3452631Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3453256Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3453673Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3454289Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3454857Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3455471Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3455908Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3456526Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3456981Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3457611Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3458101Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3458817Z W1204 10:11:49.932000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3459223Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3459584Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3460271Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3460575Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3461226Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3461764Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3462381Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3462705Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3463429Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3463874Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3464512Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3464867Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3465483Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3465978Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3466596Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3467007Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3467634Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3468070Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3468687Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3469135Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3469760Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3470259Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3470969Z W1204 10:11:51.143000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3471068Z PASSED [2.0778s] [ 9%] 2025-12-04T10:13:51.3471570Z inductor/test_compile_subprocess.py::GPUTests::test_scatter2_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (unstable on sm86) [ 10%] 2025-12-04T10:13:51.3472000Z inductor/test_compile_subprocess.py::GPUTests::test_scatter3_cuda <- test/inductor/test_torchinductor.py PASSED [0.9715s] [ 11%] 2025-12-04T10:13:51.3472423Z inductor/test_compile_subprocess.py::GPUTests::test_scatter6_cuda <- test/inductor/test_torchinductor.py PASSED [1.2150s] [ 12%] 2025-12-04T10:13:51.3473055Z inductor/test_compile_subprocess.py::GPUTests::test_scatter_add1_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Flaky test, needs debugging) [ 13%] 2025-12-04T10:13:51.3473499Z inductor/test_compile_subprocess.py::GPUTests::test_scatter_bf16_cuda <- test/inductor/test_torchinductor.py PASSED [0.6954s] [ 14%] 2025-12-04T10:13:51.3473947Z inductor/test_compile_subprocess.py::GPUTests::test_scatter_reduce1_cuda <- test/inductor/test_torchinductor.py PASSED [0.5095s] [ 15%] 2025-12-04T10:13:51.3474401Z inductor/test_compile_subprocess.py::GPUTests::test_scatter_reduce3_cuda <- test/inductor/test_torchinductor.py PASSED [0.8930s] [ 16%] 2025-12-04T10:13:51.3474944Z inductor/test_compile_subprocess.py::GPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_cuda <- test/inductor/test_torchinductor.py PASSED [0.7584s] [ 17%] 2025-12-04T10:13:51.3475392Z inductor/test_compile_subprocess.py::GPUTests::test_searchsorted_cuda <- test/inductor/test_torchinductor.py PASSED [11.5914s] [ 18%] 2025-12-04T10:13:51.3475847Z inductor/test_compile_subprocess.py::GPUTests::test_select_scatter_cuda <- test/inductor/test_torchinductor.py PASSED [0.6375s] [ 19%] 2025-12-04T10:13:51.3476332Z inductor/test_compile_subprocess.py::GPUTests::test_setitem_with_int_parameter_cuda <- test/inductor/test_torchinductor.py PASSED [0.5781s] [ 20%] 2025-12-04T10:13:51.3477120Z inductor/test_compile_subprocess.py::GPUTests::test_sgn_cuda <- test/inductor/test_torchinductor.py W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3477596Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3478285Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3478594Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3479240Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3479700Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3480313Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3480641Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3481288Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3481727Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3482373Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3482729Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3483354Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3483767Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3484469Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3484882Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3485503Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3485942Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3486554Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3487008Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3487630Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3488201Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3488916Z W1204 10:12:09.996000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3489005Z PASSED [0.3941s] [ 21%] 2025-12-04T10:13:51.3489822Z inductor/test_compile_subprocess.py::GPUTests::test_sgn_extremal_cuda <- test/inductor/test_torchinductor.py W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3490190Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3490876Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3491184Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3491829Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3492292Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3492898Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3493223Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3493873Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3494315Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3495033Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3495388Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3496007Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3496424Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3497045Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3497454Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3498076Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3498512Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3499201Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3499656Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3500278Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3500779Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3501490Z W1204 10:12:10.341000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3501585Z PASSED [0.3215s] [ 22%] 2025-12-04T10:13:51.3502034Z inductor/test_compile_subprocess.py::GPUTests::test_shape_padding_cuda <- test/inductor/test_torchinductor.py PASSED [2.5218s] [ 23%] 2025-12-04T10:13:51.3502812Z inductor/test_compile_subprocess.py::GPUTests::test_silu_cuda <- test/inductor/test_torchinductor.py W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3503237Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3503922Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3504232Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3504883Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3505340Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3506033Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3506354Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3507007Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3507449Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3508088Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3508439Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3509059Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3509476Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3510092Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3510585Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3511200Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3511644Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3512254Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3512704Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3513338Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3513824Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3514541Z W1204 10:12:13.201000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3514630Z PASSED [0.3438s] [ 24%] 2025-12-04T10:13:51.3515160Z inductor/test_compile_subprocess.py::GPUTests::test_size_asserts_for_multi_output_fallback_cuda <- test/inductor/test_torchinductor.py PASSED [0.1292s] [ 25%] 2025-12-04T10:13:51.3515586Z inductor/test_compile_subprocess.py::GPUTests::test_slice2_cuda <- test/inductor/test_torchinductor.py PASSED [0.4996s] [ 26%] 2025-12-04T10:13:51.3515998Z inductor/test_compile_subprocess.py::GPUTests::test_slice4_cuda <- test/inductor/test_torchinductor.py PASSED [0.2464s] [ 27%] 2025-12-04T10:13:51.3516451Z inductor/test_compile_subprocess.py::GPUTests::test_slice_mutation3_cuda <- test/inductor/test_torchinductor.py PASSED [0.1871s] [ 28%] 2025-12-04T10:13:51.3516994Z inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter3_cuda <- test/inductor/test_torchinductor.py PASSED [0.4034s] [ 29%] 2025-12-04T10:13:51.3517444Z inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter4_cuda <- test/inductor/test_torchinductor.py PASSED [0.4009s] [ 30%] 2025-12-04T10:13:51.3517879Z inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter_cuda <- test/inductor/test_torchinductor.py PASSED [0.6808s] [ 31%] 2025-12-04T10:13:51.3518388Z inductor/test_compile_subprocess.py::GPUTests::test_slice_scatter_dtype_consistency_cuda <- test/inductor/test_torchinductor.py PASSED [0.7860s] [ 32%] 2025-12-04T10:13:51.3518884Z inductor/test_compile_subprocess.py::GPUTests::test_slice_view_with_graph_break_cuda <- test/inductor/test_torchinductor.py PASSED [0.3959s] [ 33%] 2025-12-04T10:13:51.3519726Z inductor/test_compile_subprocess.py::GPUTests::test_softmax_backward_data_cuda <- test/inductor/test_torchinductor.py W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3520101Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3520788Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3521094Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3521824Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3522281Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3522896Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3523217Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3523864Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3524482Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3525236Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3525599Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3526223Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3526638Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3527261Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3527676Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3528296Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3528865Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3529482Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3529936Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3530563Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3531054Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3531709Z W1204 10:12:17.137000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.fma.default 2025-12-04T10:13:51.3532112Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3532581Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3533269Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3533570Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3534224Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3534680Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3535283Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3535613Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3536259Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3536706Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3537340Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3537693Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3538313Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3538724Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3539428Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3539841Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3540465Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3540904Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3541522Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3541970Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3542597Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3543144Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3544082Z W1204 10:12:17.487000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3544176Z PASSED [0.7246s] [ 34%] 2025-12-04T10:13:51.3544739Z inductor/test_compile_subprocess.py::GPUTests::test_softmax_one_kernel_loop_cuda <- test/inductor/test_torchinductor.py PASSED [0.3543s] [ 35%] 2025-12-04T10:13:51.3545705Z inductor/test_compile_subprocess.py::GPUTests::test_sort_bool_cuda <- test/inductor/test_torchinductor.py W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3546131Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3546952Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3547305Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3548079Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3548621Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3549346Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3549715Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3550496Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3551005Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3551853Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3552206Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3552826Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3553241Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3553855Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3554270Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3554889Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3555328Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3555936Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3556470Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3557095Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3557590Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3558307Z W1204 10:12:18.198000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3558712Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3559079Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3559764Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3560078Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3560722Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3561178Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3561794Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3562115Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3562861Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3563299Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3563934Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3564294Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3564910Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3565329Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3565952Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3566364Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3567059Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3567497Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3568113Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3568568Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3569196Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3569689Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3570404Z W1204 10:12:24.909000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3570496Z PASSED [13.3723s] [ 36%] 2025-12-04T10:13:51.3570907Z inductor/test_compile_subprocess.py::GPUTests::test_sort_cuda <- test/inductor/test_torchinductor.py PASSED [1.6737s] [ 37%] 2025-12-04T10:13:51.3571568Z inductor/test_compile_subprocess.py::GPUTests::test_sort_stable_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0005s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 38%] 2025-12-04T10:13:51.3572019Z inductor/test_compile_subprocess.py::GPUTests::test_sort_transpose_cuda <- test/inductor/test_torchinductor.py PASSED [25.2715s] [ 39%] 2025-12-04T10:13:51.3572488Z inductor/test_compile_subprocess.py::GPUTests::test_special_polygamma_cuda <- test/inductor/test_torchinductor.py PASSED [0.6243s] [ 40%] 2025-12-04T10:13:51.3572927Z inductor/test_compile_subprocess.py::GPUTests::test_split_cumprod_cuda <- test/inductor/test_torchinductor.py PASSED [1.2046s] [ 41%] 2025-12-04T10:13:51.3573392Z inductor/test_compile_subprocess.py::GPUTests::test_split_cumsum_low_prec_cuda <- test/inductor/test_torchinductor.py PASSED [0.2786s] [ 42%] 2025-12-04T10:13:51.3574369Z inductor/test_compile_subprocess.py::GPUTests::test_split_failed_cuda <- test/inductor/test_torchinductor.py E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] failed while attempting to run meta for aten.split_with_sizes.default 2025-12-04T10:13:51.3574741Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3575431Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2823, in _dispatch_impl 2025-12-04T10:13:51.3575779Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] r = func(*args, **kwargs) 2025-12-04T10:13:51.3576357Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 836, in __call__ 2025-12-04T10:13:51.3576729Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] return self._op(*args, **kwargs) 2025-12-04T10:13:51.3577371Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_refs/__init__.py", line 4359, in split_with_sizes 2025-12-04T10:13:51.3577697Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] torch._check_with( 2025-12-04T10:13:51.3578371Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py", line 1716, in _check_with 2025-12-04T10:13:51.3578752Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] raise error_type(message_evaluated) 2025-12-04T10:13:51.3579224Z E1204 10:13:00.587000 70704 site-packages/torch/_subclasses/fake_tensor.py:2827] [0/0] ValueError: Split sizes add up to 4 but got the tensor's size of 5 2025-12-04T10:13:51.3579317Z PASSED [0.0165s] [ 43%] 2025-12-04T10:13:51.3579767Z inductor/test_compile_subprocess.py::GPUTests::test_split_with_list_cuda <- test/inductor/test_torchinductor.py PASSED [1.4414s] [ 44%] 2025-12-04T10:13:51.3580668Z inductor/test_compile_subprocess.py::GPUTests::test_split_with_sizes_with_unbacked_symints_cuda <- test/inductor/test_torchinductor.py W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3581043Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] Traceback (most recent call last): 2025-12-04T10:13:51.3581725Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3582028Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] ).serialize() 2025-12-04T10:13:51.3582679Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3583209Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3583831Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3584152Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] pickler.dump(obj) 2025-12-04T10:13:51.3584801Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3585321Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3585962Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3586316Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3586933Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3587348Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3587971Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3588387Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3589003Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3589522Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3590138Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3590597Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3591222Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3591711Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3592371Z W1204 10:13:02.360000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [1/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3592460Z PASSED [0.8834s] [ 45%] 2025-12-04T10:13:51.3593236Z inductor/test_compile_subprocess.py::GPUTests::test_std_cuda <- test/inductor/test_torchinductor.py W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3593607Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3594288Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3594601Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3595244Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3595702Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3596387Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3596708Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3597356Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3597799Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3598440Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3598795Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3599411Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3599827Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3600543Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3600959Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3601583Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3602022Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3602634Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3603088Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3603714Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3604206Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3604921Z W1204 10:13:03.801000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3605009Z PASSED [1.6048s] [ 46%] 2025-12-04T10:13:51.3605591Z inductor/test_compile_subprocess.py::GPUTests::test_stride_preservation_with_stride_modifying_fx_pass_cuda <- test/inductor/test_torchinductor.py PASSED [0.1448s] [ 47%] 2025-12-04T10:13:51.3606041Z inductor/test_compile_subprocess.py::GPUTests::test_strided_inputs_cuda <- test/inductor/test_torchinductor.py PASSED [0.1906s] [ 48%] 2025-12-04T10:13:51.3606447Z inductor/test_compile_subprocess.py::GPUTests::test_sum1_cuda <- test/inductor/test_torchinductor.py PASSED [0.5369s] [ 49%] 2025-12-04T10:13:51.3606854Z inductor/test_compile_subprocess.py::GPUTests::test_sum3_cuda <- test/inductor/test_torchinductor.py PASSED [0.6310s] [ 50%] 2025-12-04T10:13:51.3607355Z inductor/test_compile_subprocess.py::GPUTests::test_sum_dtype_cuda <- test/inductor/test_torchinductor.py PASSED [0.7468s] [ 51%] 2025-12-04T10:13:51.3607771Z inductor/test_compile_subprocess.py::GPUTests::test_sum_int_cuda <- test/inductor/test_torchinductor.py PASSED [0.6021s] [ 52%] 2025-12-04T10:13:51.3608205Z inductor/test_compile_subprocess.py::GPUTests::test_sum_keepdims_cuda <- test/inductor/test_torchinductor.py PASSED [0.5332s] [ 53%] 2025-12-04T10:13:51.3608610Z inductor/test_compile_subprocess.py::GPUTests::test_tanh_cuda <- test/inductor/test_torchinductor.py PASSED [0.6833s] [ 54%] 2025-12-04T10:13:51.3609030Z inductor/test_compile_subprocess.py::GPUTests::test_tensor1_cuda <- test/inductor/test_torchinductor.py PASSED [0.5842s] [ 55%] 2025-12-04T10:13:51.3609443Z inductor/test_compile_subprocess.py::GPUTests::test_tensor2_cuda <- test/inductor/test_torchinductor.py PASSED [0.3459s] [ 56%] 2025-12-04T10:13:51.3609869Z inductor/test_compile_subprocess.py::GPUTests::test_tensor3_cuda <- test/inductor/test_torchinductor.py PASSED [0.4698s] [ 57%] 2025-12-04T10:13:51.3610334Z inductor/test_compile_subprocess.py::GPUTests::test_tensor_index_put_slice_cuda <- test/inductor/test_torchinductor.py PASSED [2.8146s] [ 58%] 2025-12-04T10:13:51.3611235Z inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue1_use_block_ptr_True_cuda <- test/inductor/test_torchinductor.py W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3611682Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3612365Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3612677Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3613323Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3613782Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3614395Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3614714Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3615363Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3615797Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3616438Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3616792Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3617411Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3617825Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3618525Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3622678Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3623399Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3623851Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3624622Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3625086Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3625707Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3626349Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3627062Z W1204 10:13:13.344000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3627154Z PASSED [0.8414s] [ 59%] 2025-12-04T10:13:51.3627643Z inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue2_cuda <- test/inductor/test_torchinductor.py PASSED [0.7274s] [ 60%] 2025-12-04T10:13:51.3628485Z inductor/test_compile_subprocess.py::GPUTests::test_tmp_not_defined_issue3_cuda <- test/inductor/test_torchinductor.py W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3628855Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3629542Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3629842Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3630497Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3630953Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3631563Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3631885Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3632529Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3632970Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3633710Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3634069Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3634681Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3635100Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3635712Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3636126Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3636745Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3637178Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3637953Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3638400Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3639031Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3639517Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3640159Z W1204 10:13:14.636000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3640257Z PASSED [2.8243s] [ 61%] 2025-12-04T10:13:51.3641083Z inductor/test_compile_subprocess.py::GPUTests::test_to_device_constant_cuda <- test/inductor/test_torchinductor.py W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3641458Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3642137Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3642438Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3643093Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3643550Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3644158Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3644555Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3645202Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3645641Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3646275Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3646631Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3647248Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3647662Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3648276Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3648795Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3649408Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3649851Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3650462Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3650910Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3651540Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3652027Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3652685Z W1204 10:13:17.299000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3653083Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3653446Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3654135Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3654436Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3655161Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3655614Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3656221Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3656545Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3657186Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3657625Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3658261Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3658617Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3659231Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3659725Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3660347Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3660762Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3661379Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3661814Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3662434Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3662882Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3663580Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3664073Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3664715Z W1204 10:13:17.484000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3664813Z PASSED [0.3725s] [ 62%] 2025-12-04T10:13:51.3665269Z inductor/test_compile_subprocess.py::GPUTests::test_to_memory_format_cuda <- test/inductor/test_torchinductor.py PASSED [0.7819s] [ 63%] 2025-12-04T10:13:51.3665676Z inductor/test_compile_subprocess.py::GPUTests::test_topk_cuda <- test/inductor/test_torchinductor.py PASSED [0.2284s] [ 64%] 2025-12-04T10:13:51.3666477Z inductor/test_compile_subprocess.py::GPUTests::test_transpose_add_cuda <- test/inductor/test_torchinductor.py PASSED [0.3829s] [ 65%] 2025-12-04T10:13:51.3666959Z inductor/test_compile_subprocess.py::GPUTests::test_transposed_propagates_cuda <- test/inductor/test_torchinductor.py PASSED [0.1755s] [ 66%] 2025-12-04T10:13:51.3667516Z inductor/test_compile_subprocess.py::GPUTests::test_triton_argmin_argmax_transpose_logical_index_cuda <- test/inductor/test_torchinductor.py PASSED [3.9827s] [ 67%] 2025-12-04T10:13:51.3668366Z inductor/test_compile_subprocess.py::GPUTests::test_triton_kernel_bool_param_cuda <- test/inductor/test_torchinductor.py W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3668736Z W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3669424Z W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3669727Z W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3670372Z W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3670903Z W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3671513Z W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3671832Z W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3672448Z W1204 10:13:23.472000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] AttributeError: Can't pickle local object 'CommonTemplate.test_triton_kernel_bool_param..Model' 2025-12-04T10:13:51.3672846Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3673205Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3673890Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3674189Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3674835Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3675291Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3675892Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3676221Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3676824Z W1204 10:13:23.903000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] AttributeError: Can't pickle local object 'CommonTemplate.test_triton_kernel_bool_param..Model' 2025-12-04T10:13:51.3676920Z PASSED [0.8703s] [ 68%] 2025-12-04T10:13:51.3677777Z inductor/test_compile_subprocess.py::GPUTests::test_triu_cuda <- test/inductor/test_torchinductor.py W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3678143Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3678823Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3679123Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3679770Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3680230Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3680841Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3681160Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3681881Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3682321Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3682959Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3683315Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3683930Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3684346Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3684962Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3685370Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3685990Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3686424Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3687037Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3687488Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3688111Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3688705Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3689351Z W1204 10:13:24.115000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3689758Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3690120Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3690800Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3691105Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3691750Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3692202Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3692881Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3693203Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3693847Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3694288Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3694920Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3695276Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3695890Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3696309Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3696924Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3697336Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3697952Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3698389Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3698997Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3699524Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3700150Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3700641Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3701289Z W1204 10:13:24.435000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3701377Z PASSED [0.6394s] [ 69%] 2025-12-04T10:13:51.3702202Z inductor/test_compile_subprocess.py::GPUTests::test_uint4x2_mixed_mm_cuda <- test/inductor/test_torchinductor.py W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3702566Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3703322Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3703709Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3704350Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3704813Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3705416Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3705737Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3706383Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3706817Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3707460Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3707809Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3708429Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3708844Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3709456Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3709871Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3710556Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3710996Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3711603Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3712056Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3712680Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3713170Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3713879Z W1204 10:13:24.756000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3714352Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3714719Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3715397Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3715708Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3716348Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3716802Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3717414Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3717730Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3718378Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3718813Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3719447Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3719802Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3720415Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3720829Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3721518Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3721930Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3722542Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3722987Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3723592Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3724043Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3724834Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3725454Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3726168Z W1204 10:13:25.113000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3726258Z PASSED [0.6396s] [ 70%] 2025-12-04T10:13:51.3727093Z inductor/test_compile_subprocess.py::GPUTests::test_unbacked_float_item_cuda <- test/inductor/test_torchinductor.py W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3727466Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3728145Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3728454Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3729096Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3729561Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3730166Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3730483Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3731129Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3731565Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3732333Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3732685Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3733304Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3733723Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3734336Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3734752Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3735369Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3735804Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3736412Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3736943Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3737564Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3738061Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3738772Z W1204 10:13:25.661000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3738867Z PASSED [0.5501s] [ 71%] 2025-12-04T10:13:51.3739367Z inductor/test_compile_subprocess.py::GPUTests::test_unbacked_floordiv_simplify_cuda <- test/inductor/test_torchinductor.py PASSED [0.8400s] [ 72%] 2025-12-04T10:13:51.3739885Z inductor/test_compile_subprocess.py::GPUTests::test_unbacked_floordiv_simplify_errors_cuda <- test/inductor/test_torchinductor.py PASSED [0.0184s] [ 73%] 2025-12-04T10:13:51.3740359Z inductor/test_compile_subprocess.py::GPUTests::test_unroll_small_reduction_cuda <- test/inductor/test_torchinductor.py PASSED [2.0299s] [ 74%] 2025-12-04T10:13:51.3740833Z inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_float32_cuda <- test/inductor/test_torchinductor.py PASSED [0.8660s] [ 75%] 2025-12-04T10:13:51.3741298Z inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_float64_cuda <- test/inductor/test_torchinductor.py PASSED [0.5526s] [ 76%] 2025-12-04T10:13:51.3741759Z inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int16_cuda <- test/inductor/test_torchinductor.py PASSED [0.4515s] [ 77%] 2025-12-04T10:13:51.3742218Z inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int32_cuda <- test/inductor/test_torchinductor.py PASSED [0.4529s] [ 78%] 2025-12-04T10:13:51.3742672Z inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int64_cuda <- test/inductor/test_torchinductor.py PASSED [0.4574s] [ 79%] 2025-12-04T10:13:51.3743208Z inductor/test_compile_subprocess.py::GPUTests::test_unspec_inputs_int8_cuda <- test/inductor/test_torchinductor.py PASSED [0.4504s] [ 80%] 2025-12-04T10:13:51.3743740Z inductor/test_compile_subprocess.py::GPUTests::test_unsqueeze_inplace_cuda <- test/inductor/test_torchinductor.py PASSED [0.4083s] [ 81%] 2025-12-04T10:13:51.3744588Z inductor/test_compile_subprocess.py::GPUTests::test_upsample_bilinear2d_a_cuda <- test/inductor/test_torchinductor.py W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3744957Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3745646Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3745946Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3746592Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3747049Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3747652Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3748053Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3748696Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3749140Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3749773Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3750122Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3750746Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3751155Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3751776Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3752183Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3752801Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3753238Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3753844Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3754293Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3754992Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3755485Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3756133Z W1204 10:13:32.633000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3756530Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3756901Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3757582Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3757887Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3758528Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3759062Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3759666Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3759989Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3760635Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3761069Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3761712Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3762065Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3762686Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3763094Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3763708Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3764128Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3764744Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3765256Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3765867Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3766321Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3766945Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3767431Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3768147Z W1204 10:13:33.700000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3768235Z PASSED [2.1595s] [ 82%] 2025-12-04T10:13:51.3768848Z inductor/test_compile_subprocess.py::GPUTests::test_upsample_cat_conv_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (only support cpu upsample_cat_conv test) [ 83%] 2025-12-04T10:13:51.3769680Z inductor/test_compile_subprocess.py::GPUTests::test_upsample_nearest1d_cuda <- test/inductor/test_torchinductor.py W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3770147Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3770833Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3771137Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3771787Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3772241Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3772852Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3773169Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3773813Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3774253Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3774885Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3775246Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3775866Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3776357Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3776975Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3777383Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3778006Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3778444Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3779058Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3779508Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3780132Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3780699Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3781345Z W1204 10:13:34.696000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.iota.default 2025-12-04T10:13:51.3781747Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3782113Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3782793Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3783147Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3783789Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3784245Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3784857Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3785179Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3785816Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3786262Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3786894Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3787326Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3787946Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3788354Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3788976Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3789387Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3790008Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3790442Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3791047Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3791577Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3792196Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3792691Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3793399Z W1204 10:13:35.214000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3793491Z PASSED [0.9930s] [ 84%] 2025-12-04T10:13:51.3793939Z inductor/test_compile_subprocess.py::GPUTests::test_var_correction_cuda <- test/inductor/test_torchinductor.py PASSED [1.1277s] [ 85%] 2025-12-04T10:13:51.3794441Z inductor/test_compile_subprocess.py::GPUTests::test_var_mean_tile_reduction_False_cuda <- test/inductor/test_torchinductor.py PASSED [0.7047s] [ 86%] 2025-12-04T10:13:51.3794869Z inductor/test_compile_subprocess.py::GPUTests::test_vdd_clamp_cuda <- test/inductor/test_torchinductor.py PASSED [0.3820s] [ 87%] 2025-12-04T10:13:51.3795312Z inductor/test_compile_subprocess.py::GPUTests::test_view_as_complex_cuda <- test/inductor/test_torchinductor.py PASSED [0.2389s] [ 88%] 2025-12-04T10:13:51.3795745Z inductor/test_compile_subprocess.py::GPUTests::test_view_as_real_cuda <- test/inductor/test_torchinductor.py PASSED [0.2205s] [ 89%] 2025-12-04T10:13:51.3796174Z inductor/test_compile_subprocess.py::GPUTests::test_view_detach_cuda <- test/inductor/test_torchinductor.py PASSED [0.2133s] [ 90%] 2025-12-04T10:13:51.3796588Z inductor/test_compile_subprocess.py::GPUTests::test_views2_cuda <- test/inductor/test_torchinductor.py PASSED [2.1635s] [ 91%] 2025-12-04T10:13:51.3797007Z inductor/test_compile_subprocess.py::GPUTests::test_views4_cuda <- test/inductor/test_torchinductor.py PASSED [1.6295s] [ 92%] 2025-12-04T10:13:51.3797417Z inductor/test_compile_subprocess.py::GPUTests::test_views5_cuda <- test/inductor/test_torchinductor.py PASSED [0.2262s] [ 93%] 2025-12-04T10:13:51.3797831Z inductor/test_compile_subprocess.py::GPUTests::test_views6_cuda <- test/inductor/test_torchinductor.py PASSED [0.3902s] [ 94%] 2025-12-04T10:13:51.3798730Z inductor/test_compile_subprocess.py::GPUTests::test_weight_norm_bwd_cuda <- test/inductor/test_torchinductor.py W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3799105Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] Traceback (most recent call last): 2025-12-04T10:13:51.3799794Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3800108Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] ).serialize() 2025-12-04T10:13:51.3800757Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3801219Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3801833Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3802159Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] pickler.dump(obj) 2025-12-04T10:13:51.3802881Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3803330Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3803972Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3804330Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] cls(obj, pickler.options), 2025-12-04T10:13:51.3804948Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3805371Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3805987Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3806401Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3807022Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3807461Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3808076Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3808526Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3809313Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3809807Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3810520Z W1204 10:13:43.410000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3810927Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3811296Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] Traceback (most recent call last): 2025-12-04T10:13:51.3811987Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3812297Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] ).serialize() 2025-12-04T10:13:51.3812950Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3813509Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3814120Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3814448Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] pickler.dump(obj) 2025-12-04T10:13:51.3815099Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3815537Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3816174Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3816535Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] cls(obj, pickler.options), 2025-12-04T10:13:51.3817149Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3817572Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3818188Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3818604Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3819222Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3819660Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3820351Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3820801Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3821431Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3821925Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3822639Z W1204 10:13:43.753000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0_1] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3822729Z PASSED [1.1509s] [ 95%] 2025-12-04T10:13:51.3823309Z inductor/test_compile_subprocess.py::GPUTests::test_weight_norm_conv2d_cuda <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 96%] 2025-12-04T10:13:51.3824129Z inductor/test_compile_subprocess.py::GPUTests::test_where_broadcast_cuda <- test/inductor/test_torchinductor.py W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3824695Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] Traceback (most recent call last): 2025-12-04T10:13:51.3825375Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3825659Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] ).serialize() 2025-12-04T10:13:51.3826300Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3826749Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3827349Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3827657Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] pickler.dump(obj) 2025-12-04T10:13:51.3828291Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3828724Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3829348Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3829682Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] cls(obj, pickler.options), 2025-12-04T10:13:51.3830296Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3830697Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3831310Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3831828Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3832441Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3832869Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3833468Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3833910Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3834531Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3835024Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3835725Z W1204 10:13:44.537000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3835923Z PASSED [1.1285s] [ 97%] 2025-12-04T10:13:51.3836394Z inductor/test_compile_subprocess.py::GPUTests::test_where_with_logical_op_cuda <- test/inductor/test_torchinductor.py PASSED [0.7532s] [ 98%] 2025-12-04T10:13:51.3836855Z inductor/test_compile_subprocess.py::GPUTests::test_zero_dim_reductions_cuda <- test/inductor/test_torchinductor.py PASSED [0.3192s] [ 99%] 2025-12-04T10:13:51.3837713Z inductor/test_compile_subprocess.py::GPUTests::test_zero_element_mutation_cuda <- test/inductor/test_torchinductor.py W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3838079Z W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3838772Z W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3839072Z W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3839717Z W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3840185Z W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3840791Z W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3841121Z W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3841738Z W1204 10:13:46.291000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] AttributeError: Can't pickle local object 'CommonTemplate.test_zero_element_mutation..CustomModel' 2025-12-04T10:13:51.3842134Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Unable to pickle input graph or example inputs 2025-12-04T10:13:51.3842578Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] Traceback (most recent call last): 2025-12-04T10:13:51.3843261Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 486, in serialize_compile 2025-12-04T10:13:51.3843567Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] ).serialize() 2025-12-04T10:13:51.3844213Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx_ext.py", line 210, in serialize 2025-12-04T10:13:51.3844670Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _WireProtocolPickledInput(GraphPickler.dumps(self)) 2025-12-04T10:13:51.3845279Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 127, in dumps 2025-12-04T10:13:51.3845596Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] pickler.dump(obj) 2025-12-04T10:13:51.3846242Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 94, in reducer_override 2025-12-04T10:13:51.3846755Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return _GraphModulePickleData.reduce_helper(self, obj) 2025-12-04T10:13:51.3847391Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 352, in reduce_helper 2025-12-04T10:13:51.3847740Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] cls(obj, pickler.options), 2025-12-04T10:13:51.3848361Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 364, in __init__ 2025-12-04T10:13:51.3848770Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.graph = _GraphPickleData(gm._graph, options) 2025-12-04T10:13:51.3849382Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 571, in __init__ 2025-12-04T10:13:51.3849800Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] nodes[node] = _NodePickleData(node, nodes, options) 2025-12-04T10:13:51.3850411Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 387, in __init__ 2025-12-04T10:13:51.3850853Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] self.target = _OpPickleData.pickle(node.target, options) 2025-12-04T10:13:51.3851464Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 435, in pickle 2025-12-04T10:13:51.3851917Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] return cls._pickle_op(name, _OpOverloadPickleData, options) 2025-12-04T10:13:51.3852542Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/_graph_pickler.py", line 456, in _pickle_op 2025-12-04T10:13:51.3853027Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] raise BypassFxGraphCache(f"Unable to pickle non-standard op: {name}") 2025-12-04T10:13:51.3853841Z W1204 10:13:46.395000 70704 site-packages/torch/_inductor/compile_fx_ext.py:493] [0/0] torch._inductor.codecache.BypassFxGraphCache: Unable to pickle non-standard op: torch.ops.prims.convert_element_type.default 2025-12-04T10:13:51.3853933Z PASSED [0.2156s] [100%] 2025-12-04T10:13:51.3853938Z 2025-12-04T10:13:51.3854591Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-e43b32b50674a33e.xml - 2025-12-04T10:13:51.3854791Z ========== 94 passed, 7 skipped, 332 deselected in 134.97s (0:02:14) =========== 2025-12-04T10:13:51.3855303Z The following tests failed and then succeeded when run in a new process['test/inductor/test_compile_subprocess.py::GPUTests::test_remove_noop_slice_cuda'] 2025-12-04T10:13:51.3855312Z 2025-12-04T10:13:51.3855808Z FINISHED PRINTING LOG FILE of inductor/test_compile_subprocess 1/2 (test/test-reports/inductor.test_compile_subprocess_1.2_f3ccfd39c19f4fed_.log) 2025-12-04T10:13:51.3855813Z 2025-12-04T10:13:51.3856129Z Finished inductor/test_compile_subprocess 1/2 ... [2025-12-04 10:13:51.070468][3709.968025693], took 8.22min 2025-12-04T10:13:51.3856825Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-9265b0e9898bf9b1.xml 2025-12-04T10:13:51.3857542Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-de1c2d4fca581e1f.xml 2025-12-04T10:13:51.3858354Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-e43b32b50674a33e.xml 2025-12-04T10:13:51.3858612Z Running inductor/test_deterministic 1/3 ... [2025-12-04 10:13:51.274209][3710.171775086] 2025-12-04T10:13:51.3858723Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:13:51.3859689Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_deterministic.py', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:13:51.274603] 2025-12-04T10:17:20.7831768Z 2025-12-04T10:17:20.7832734Z inductor/test_deterministic 1/3 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_deterministic_1.3_01f8a4a4771bae7d_.log 2025-12-04T10:17:20.7841682Z Running 15 items in this shard: test/inductor/test_deterministic.py::DeterministicTest::test_max_autotune_deterministic_True, test/inductor/test_deterministic.py::DeterministicTest::test_mm_padding_deterministic_False, test/inductor/test_deterministic.py::DeterministicTest::test_mm_padding_deterministic_True, test/inductor/test_deterministic.py::DeterministicTest::test_reduction_coordesc_tuning_deterministic_False, test/inductor/test_deterministic.py::DeterministicTest::test_reduction_coordesc_tuning_deterministic_True, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_BertForMaskedLM_training_or_inference_inference_precision_amp, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_BertForMaskedLM_training_or_inference_inference_precision_float16, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_BertForMaskedLM_training_or_inference_training_precision_amp, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_BertForMaskedLM_training_or_inference_training_precision_bfloat16, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_BertForMaskedLM_training_or_inference_training_precision_float32, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_DistillGPT2_training_or_inference_inference_precision_amp, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_DistillGPT2_training_or_inference_inference_precision_float16, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_DistillGPT2_training_or_inference_inference_precision_float32, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_GoogleFnet_training_or_inference_inference_precision_bfloat16, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_GoogleFnet_training_or_inference_training_precision_float32 2025-12-04T10:17:20.7849927Z 2025-12-04T10:17:20.7850239Z Finished inductor/test_deterministic 1/3 ... [2025-12-04 10:17:20.782809][3919.680377837], took 3.49min 2025-12-04T10:17:20.7851314Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_deterministic/inductor.test_deterministic-c7eee0aee9f5b9d8.xml 2025-12-04T10:17:20.8639481Z Running dynamo/test_sets 1/1 ... [2025-12-04 10:17:20.863565][3919.76113395] 2025-12-04T10:17:20.8639932Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:17:20.8642919Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_sets.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:17:20.863921] 2025-12-04T10:17:32.2528380Z 2025-12-04T10:17:32.2529675Z dynamo/test_sets 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_sets_1.1_015994dafeaf086f_.log 2025-12-04T10:17:32.2560112Z Running 124 items in this shard: test/dynamo/test_sets.py::CustomSetTests::test_custom_add, test/dynamo/test_sets.py::CustomSetTests::test_custom_contains, test/dynamo/test_sets.py::MiscTests::test_isdisjoint_with_generator, test/dynamo/test_sets.py::TestSetGuards::test_in_guard, test/dynamo/test_sets.py::TestSetGuards::test_set_guard_on_keys_change, test/dynamo/test_sets.py::TestSetGuards::test_set_multiple_types, test/dynamo/test_sets.py::TestSetGuards::test_set_recompile_on_key_change, test/dynamo/test_sets.py::TestSetGuards::test_set_recompile_on_key_pop, test/dynamo/test_sets.py::TestSetGuards::test_set_with_function, test/dynamo/test_sets.py::TestSetGuards::test_set_with_tensors, test/dynamo/test_sets.py::FrozensetTests::test_binop_and, test/dynamo/test_sets.py::FrozensetTests::test_binop_or, test/dynamo/test_sets.py::FrozensetTests::test_binop_sub, test/dynamo/test_sets.py::FrozensetTests::test_binop_xor, test/dynamo/test_sets.py::FrozensetTests::test_cmp_eq, test/dynamo/test_sets.py::FrozensetTests::test_cmp_greater_than, test/dynamo/test_sets.py::FrozensetTests::test_cmp_greater_than_or_equal, test/dynamo/test_sets.py::FrozensetTests::test_cmp_less_than, test/dynamo/test_sets.py::FrozensetTests::test_cmp_less_than_or_equal, test/dynamo/test_sets.py::FrozensetTests::test_cmp_ne, test/dynamo/test_sets.py::FrozensetTests::test_constructor_iterable, test/dynamo/test_sets.py::FrozensetTests::test_contains, test/dynamo/test_sets.py::FrozensetTests::test_copy, test/dynamo/test_sets.py::FrozensetTests::test_difference, test/dynamo/test_sets.py::FrozensetTests::test_equality, test/dynamo/test_sets.py::FrozensetTests::test_in_frozenset, test/dynamo/test_sets.py::FrozensetTests::test_intersection, test/dynamo/test_sets.py::FrozensetTests::test_isdisjoint, test/dynamo/test_sets.py::FrozensetTests::test_issubset, test/dynamo/test_sets.py::FrozensetTests::test_issuperset, test/dynamo/test_sets.py::FrozensetTests::test_symmetric_difference, test/dynamo/test_sets.py::FrozensetTests::test_to_frozenset, test/dynamo/test_sets.py::FrozensetTests::test_to_set, test/dynamo/test_sets.py::FrozensetTests::test_union, test/dynamo/test_sets.py::SetTests::test_add, test/dynamo/test_sets.py::SetTests::test_binop_and, test/dynamo/test_sets.py::SetTests::test_binop_or, test/dynamo/test_sets.py::SetTests::test_binop_sub, test/dynamo/test_sets.py::SetTests::test_binop_xor, test/dynamo/test_sets.py::SetTests::test_clear, test/dynamo/test_sets.py::SetTests::test_cmp_eq, test/dynamo/test_sets.py::SetTests::test_cmp_greater_than, test/dynamo/test_sets.py::SetTests::test_cmp_greater_than_or_equal, test/dynamo/test_sets.py::SetTests::test_cmp_less_than, test/dynamo/test_sets.py::SetTests::test_cmp_less_than_or_equal, test/dynamo/test_sets.py::SetTests::test_cmp_ne, test/dynamo/test_sets.py::SetTests::test_constructor_iterable, test/dynamo/test_sets.py::SetTests::test_contains, test/dynamo/test_sets.py::SetTests::test_copy, test/dynamo/test_sets.py::SetTests::test_difference, test/dynamo/test_sets.py::SetTests::test_difference_update, test/dynamo/test_sets.py::SetTests::test_discard, test/dynamo/test_sets.py::SetTests::test_equality, test/dynamo/test_sets.py::SetTests::test_in_frozenset, test/dynamo/test_sets.py::SetTests::test_intersection, test/dynamo/test_sets.py::SetTests::test_intersection_update, test/dynamo/test_sets.py::SetTests::test_isdisjoint, test/dynamo/test_sets.py::SetTests::test_issubset, test/dynamo/test_sets.py::SetTests::test_issuperset, test/dynamo/test_sets.py::SetTests::test_pop, test/dynamo/test_sets.py::SetTests::test_remove, test/dynamo/test_sets.py::SetTests::test_symmetric_difference, test/dynamo/test_sets.py::SetTests::test_symmetric_difference_update, test/dynamo/test_sets.py::SetTests::test_to_frozenset, test/dynamo/test_sets.py::SetTests::test_to_set, test/dynamo/test_sets.py::SetTests::test_union, test/dynamo/test_sets.py::SetTests::test_update, test/dynamo/test_sets.py::UserDefinedSetTests::test_add, test/dynamo/test_sets.py::UserDefinedSetTests::test_binop_and, test/dynamo/test_sets.py::UserDefinedSetTests::test_binop_or, test/dynamo/test_sets.py::UserDefinedSetTests::test_binop_sub, test/dynamo/test_sets.py::UserDefinedSetTests::test_binop_xor, test/dynamo/test_sets.py::UserDefinedSetTests::test_clear, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_eq, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_greater_than, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_greater_than_or_equal, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_less_than, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_less_than_or_equal, test/dynamo/test_sets.py::UserDefinedSetTests::test_cmp_ne, test/dynamo/test_sets.py::UserDefinedSetTests::test_constructor_iterable, test/dynamo/test_sets.py::UserDefinedSetTests::test_contains, test/dynamo/test_sets.py::UserDefinedSetTests::test_copy, test/dynamo/test_sets.py::UserDefinedSetTests::test_difference, test/dynamo/test_sets.py::UserDefinedSetTests::test_difference_update, test/dynamo/test_sets.py::UserDefinedSetTests::test_discard, test/dynamo/test_sets.py::UserDefinedSetTests::test_equality, test/dynamo/test_sets.py::UserDefinedSetTests::test_in_frozenset, test/dynamo/test_sets.py::UserDefinedSetTests::test_intersection, test/dynamo/test_sets.py::UserDefinedSetTests::test_intersection_update, test/dynamo/test_sets.py::UserDefinedSetTests::test_isdisjoint, test/dynamo/test_sets.py::UserDefinedSetTests::test_issubset, test/dynamo/test_sets.py::UserDefinedSetTests::test_issuperset, test/dynamo/test_sets.py::UserDefinedSetTests::test_pop, test/dynamo/test_sets.py::UserDefinedSetTests::test_remove, test/dynamo/test_sets.py::UserDefinedSetTests::test_symmetric_difference, test/dynamo/test_sets.py::UserDefinedSetTests::test_symmetric_difference_update, test/dynamo/test_sets.py::UserDefinedSetTests::test_to_frozenset, test/dynamo/test_sets.py::UserDefinedSetTests::test_to_set, test/dynamo/test_sets.py::UserDefinedSetTests::test_union, test/dynamo/test_sets.py::UserDefinedSetTests::test_update, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_binop_and, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_binop_or, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_binop_sub, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_binop_xor, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_eq, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_greater_than, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_greater_than_or_equal, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_less_than, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_less_than_or_equal, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_cmp_ne, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_constructor_iterable, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_contains, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_copy, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_difference, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_equality, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_in_frozenset, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_intersection, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_isdisjoint, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_issubset, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_issuperset, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_symmetric_difference, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_to_frozenset, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_to_set, test/dynamo/test_sets.py::UserDefinedFrozensetTests::test_union 2025-12-04T10:17:32.2593208Z 2025-12-04T10:17:32.2593491Z Finished dynamo/test_sets 1/1 ... [2025-12-04 10:17:32.252381][3931.149949329], took 0.19min 2025-12-04T10:17:32.2594413Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_sets/dynamo.test_sets-c3cfb01bc5b67b47.xml 2025-12-04T10:17:32.3381349Z Running dynamo/test_compiler_bisector 1/1 ... [2025-12-04 10:17:32.337722][3931.235290728] 2025-12-04T10:17:32.3381863Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:17:32.3389775Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_compiler_bisector.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:17:32.338070] 2025-12-04T10:17:47.3331462Z 2025-12-04T10:17:47.3332526Z dynamo/test_compiler_bisector 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_compiler_bisector_1.1_b2d443336d449e1c_.log 2025-12-04T10:17:47.3336048Z Running 9 items in this shard: test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_bad_decomp, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_bad_lowering, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_bisect_pre_grad_graph, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_crossref, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_eager_backend, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_emulate_precision_casts, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_joint_graph, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_pre_grad, test/dynamo/test_compiler_bisector.py::TestCompilerBisector::test_rng 2025-12-04T10:17:47.3338975Z 2025-12-04T10:17:47.3339301Z Finished dynamo/test_compiler_bisector 1/1 ... [2025-12-04 10:17:47.332506][3946.229998932], took 0.25min 2025-12-04T10:17:47.3350822Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_compiler_bisector/dynamo.test_compiler_bisector-8187201dc9315a4d.xml 2025-12-04T10:17:47.4275114Z Running dynamo/test_decorators 1/1 ... [2025-12-04 10:17:47.427102][3946.32467109] 2025-12-04T10:17:47.4275591Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:17:47.4278322Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_decorators.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:17:47.427447] 2025-12-04T10:18:02.9697926Z 2025-12-04T10:18:02.9698992Z dynamo/test_decorators 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_decorators_1.1_6109b86b03700c73_.log 2025-12-04T10:18:02.9723855Z Running 73 items in this shard: test/dynamo/test_decorators.py::DecoratorTests::test_allow_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_allow_in_graph_no_id_reuse, test/dynamo/test_decorators.py::DecoratorTests::test_assume_constant_result_on_computation_with_graph_input, test/dynamo/test_decorators.py::DecoratorTests::test_assume_constant_result_on_user_defined_fn, test/dynamo/test_decorators.py::DecoratorTests::test_class_methods, test/dynamo/test_decorators.py::DecoratorTests::test_disable_for_custom_op, test/dynamo/test_decorators.py::DecoratorTests::test_disable_ignores_outer_wraps, test/dynamo/test_decorators.py::DecoratorTests::test_disable_nn_module_with_class_decorator, test/dynamo/test_decorators.py::DecoratorTests::test_disable_nn_modules_forward_hook, test/dynamo/test_decorators.py::DecoratorTests::test_disable_optimize, test/dynamo/test_decorators.py::DecoratorTests::test_disable_recursive_false, test/dynamo/test_decorators.py::DecoratorTests::test_disable_recursive_false_weird, test/dynamo/test_decorators.py::DecoratorTests::test_disable_recursive_flags, test/dynamo/test_decorators.py::DecoratorTests::test_disallow_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_dont_skip_tracing, test/dynamo/test_decorators.py::DecoratorTests::test_dynamo_disable_annotations, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_empty_graph, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_error, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_export, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_fullgraph, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_nested, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_nested_deep, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_nested_with_skip, test/dynamo/test_decorators.py::DecoratorTests::test_error_on_graph_break_nonempty_checkpoint, test/dynamo/test_decorators.py::DecoratorTests::test_fail_on_recompile_shows_guard_details, test/dynamo/test_decorators.py::DecoratorTests::test_graph_break, test/dynamo/test_decorators.py::DecoratorTests::test_incorrect_usage_disallow_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_mark_static_address_guarded, test/dynamo/test_decorators.py::DecoratorTests::test_mark_static_address_unguarded, test/dynamo/test_decorators.py::DecoratorTests::test_mark_static_nn_module, test/dynamo/test_decorators.py::DecoratorTests::test_nested_compile_error_on_graph_break, test/dynamo/test_decorators.py::DecoratorTests::test_nested_compile_fullgraph, test/dynamo/test_decorators.py::DecoratorTests::test_nested_disable_decorator, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_newly_constructed_trace_register_constant_type_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_captured_external_tensor, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_custom_class_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_custom_class_output_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_inside_compiled_function, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_inside_compiled_function_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_inside_compiled_function_kwarg, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_int_and_float_output, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_nested_custom_class, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_nested_custom_class_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_newly_constructed_custom_class_with_side_effects, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_newly_constructed_dict_with_side_effects, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_no_action_at_a_distance, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_object_in_context_error, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_on_method, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_custom_class, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_custom_class_with_side_effects, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_dict, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_dict_with_side_effects, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_pre_existing_register_constant_type_guard, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_tensor_args, test/dynamo/test_decorators.py::DecoratorTests::test_nonstrict_trace_tuple_and_sym_int_output, test/dynamo/test_decorators.py::DecoratorTests::test_patch_dynamo_config_errors, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_aot_eager_then_compile, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_eager_on_recompile, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_eager_then_compile, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_eager_then_compile_with_graph_break, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_fail_on_recompile, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_fail_on_recompile_with_disable, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_forbid_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_force_backend, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_force_backend_with_disable, test/dynamo/test_decorators.py::DecoratorTests::test_set_stance_force_eager, test/dynamo/test_decorators.py::DecoratorTests::test_skip_frame, test/dynamo/test_decorators.py::DecoratorTests::test_step_unsupported, test/dynamo/test_decorators.py::DecoratorTests::test_step_unsupported_empty_checkpoint, test/dynamo/test_decorators.py::DecoratorTests::test_substitute_in_graph, test/dynamo/test_decorators.py::DecoratorTests::test_torch_guards_stack_frame_register_inlining_disable, test/dynamo/test_decorators.py::DecoratorTests::test_torch_guards_stack_frame_register_inlining_partially_disable 2025-12-04T10:18:02.9748193Z 2025-12-04T10:18:02.9748478Z Finished dynamo/test_decorators 1/1 ... [2025-12-04 10:18:02.969493][3961.867061538], took 0.26min 2025-12-04T10:18:02.9749487Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_decorators/dynamo.test_decorators-0e583763ebd3b3fc.xml 2025-12-04T10:18:03.0526831Z Running test_transformers 1/1 ... [2025-12-04 10:18:03.052251][3961.949819699] 2025-12-04T10:18:03.0527273Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:18:03.0530498Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_transformers.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:18:03.052671] 2025-12-04T10:21:18.1266309Z 2025-12-04T10:21:18.1267071Z test_transformers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_transformers_1.1_173f13e8d6354705_.log 2025-12-04T10:21:19.0882173Z Running 12415 items in this shard: test/test_transformers.py::TestTransformersCUDA::test_bias_is_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_decoder_only_layer_cuda, test/test_transformers.py::TestTransformersCUDA::test_decoder_padding_and_src_mask_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_disable_fastpath_cuda, test/test_transformers.py::TestTransformersCUDA::test_encoder_is_causal_cuda, test/test_transformers.py::TestTransformersCUDA::test_encoder_padding_and_src_mask_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_is_causal_gpu_cuda, test/test_transformers.py::TestTransformersCUDA::test_kpm_mask_trailing_column_with_nested_tensor_cuda, test/test_transformers.py::TestTransformersCUDA::test_mask_check_fastpath_cuda, test/test_transformers.py::TestTransformersCUDA::test_math_backend_high_precision_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_1_bias_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_1_bias_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_8_bias_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_8_bias_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_2D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_2D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_2D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_2D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_2D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_2D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_3D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_3D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_3D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_3D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_3D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_3D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_no_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_no_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_3D_input_dim_no_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_2D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_2D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_2D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_2D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_2D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_2D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_4D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_4D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_4D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_4D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_4D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_4D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_no_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_no_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_0_4D_input_dim_no_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_2D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_2D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_2D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_2D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_2D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_2D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_3D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_3D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_3D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_3D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_3D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_3D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_no_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_no_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_3D_input_dim_no_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_2D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_2D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_2D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_2D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_2D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_2D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_4D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_4D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_4D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_4D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_4D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_4D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_no_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_no_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_batch_size_5_4D_input_dim_no_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_script_encoder_subclass_cuda, test/test_transformers.py::TestTransformersCUDA::test_script_mha_in_proj_weight_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_self_attn_TxT_attn_mask_cuda, test/test_transformers.py::TestTransformersCUDA::test_train_with_is_causal_cuda, test/test_transformers.py::TestTransformersCUDA::test_train_with_pad_and_catch_error_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformer_bias_is_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_False_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_True_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_False_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_True_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_False_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_False_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_True_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_True_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_False_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_False_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_True_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_True_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_False_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_False_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_True_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_True_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_no_fastpath_with_hooks_nhead_3_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_no_fastpath_with_hooks_nhead_4_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_1_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_4_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_8_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_subclass_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_subclass_model_cuda, test/test_transformers.py::TestTransformersCUDA::test_with_nested_tensor_input_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_dispatch_fails_no_backend_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_atteention_large_bf16_nan_values_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_attention_fail_with_non_square_causal_attention_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_autocast_fp32_bfloat16_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_autocast_fp32_float16_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_193_dropout_p_0_0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_193_dropout_p_0_2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_256_dropout_p_0_0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_256_dropout_p_0_2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_fail_fp32_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_nested_broadcasting_error_cases_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_nested_broadcasting_requires_grad_failure_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_seq_len_0_inputs_fused_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_seq_len_0_inputs_fused_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_attn_mask_present_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sdpa_kernel_grouped_query_attention_cuda_fused_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mask_invalid_last_dim_stride_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mask_invalid_last_dim_stride_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_fail_with_batch_size_geq_65536_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_fail_with_batch_size_geq_65536_error_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_large_seq_len_uniform_attention_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_efficient_fail_bfloat16_less_than_sm80_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_nested_fails_on_padding_head_dim_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_unaligned_tensors_cuda, test/test_transformers.py::TestSDPACUDA::test_scaled_dot_product_attention_fp16_overflow_cuda, test/test_transformers.py::TestSDPACUDA::test_scaled_dot_product_attention_math_with_negative_scale_kernel0_cuda, test/test_transformers.py::TestSDPACUDA::test_sdp_math_gradcheck_contiguous_inputs_False_cuda, test/test_transformers.py::TestSDPACUDA::test_sdp_math_gradcheck_contiguous_inputs_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_broken_166211_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_compiles_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_d256_heuristic_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_different_dk_dv_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_fail_d128_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_gqa_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_nonmodulo64seqlen_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_preserves_query_layout_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_seqlen1_dropout_heuristic_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_trivial_output_transpose_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_different_dk_dv_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel0_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel0_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel1_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel1_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel2_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel2_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_query_dense_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_seq_len_1_inputs_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_seq_len_1_inputs_fused_kernel1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_choice_type_dense_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_choice_type_nested_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_priority_order_use_compile_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_priority_order_use_compile_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_long_sequence_mask_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_long_sequence_mask_float32_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contig_mask_bug_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contiguous_mask_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contiguous_mask_float32_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_backwards_determinism_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_2_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_3_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_4_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_cudnn_nested_type_nested_is_contiguous_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_cudnn_nested_type_nested_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_dense_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_dense_fused_kernel1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_nested_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_nested_fused_kernel1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_dense_is_contiguous_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_dense_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_nested_is_contiguous_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_nested_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_choice_with_determinism_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_choice_with_determinism_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_False_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_False_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_True_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_True_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_False_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_False_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_True_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_True_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_False_is_causal_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_False_is_causal_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_True_is_causal_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_True_is_causal_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_singelton_head_dim_stride_ne_1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_and_mask_fails_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape3_cuda 2025-12-04T10:21:20.0066523Z 2025-12-04T10:21:20.0066827Z Finished test_transformers 1/1 ... [2025-12-04 10:21:18.169869][4157.067429742], took 3.25min 2025-12-04T10:21:20.0067772Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_transformers/test_transformers-4fb24eefb9a01bd4.xml 2025-12-04T10:21:20.0068687Z Running test_ops_fwd_gradients 3/12 ... [2025-12-04 10:21:18.578640][4157.476202574] 2025-12-04T10:21:20.0069130Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:21:20.0070109Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_fwd_gradients.py', '--shard-id=3', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:21:18.579111] 2025-12-04T10:32:52.5904553Z 2025-12-04T10:32:52.5908123Z test_ops_fwd_gradients 3/12 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_fwd_gradients_3.12_23fc26b77348a9ee_.log 2025-12-04T10:32:52.6022693Z Running 292 items in this shard: test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_H_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__unsafe_masked_index_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_alias_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_alias_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_allclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_aminmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_broadcast_to_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cholesky_inverse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_column_stack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_copysign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cov_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diag_embed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_permuted_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expand_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expand_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expm1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fftshift_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_hfft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_hfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_irfft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_rfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fliplr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_flipud_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_full_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_half_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_hstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_igammac_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_reduce_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_int_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isinf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isnan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ldexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_power_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_svdvals_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_vector_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logdet_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_std_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_std_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_sum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_var_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_median_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_min_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_movedim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ne_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_empty_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_empty_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_local_response_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_constant_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_reflect_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nonzero_static_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nonzero_static_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_outer_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_permute_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rand_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_randn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ravel_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_reduce_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_select_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_exponential_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sinc_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_softmax_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_legendre_polynomial_p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_modified_bessel_k0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_xlog1py_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_with_sizes_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_to_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trapezoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trapz_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_true_divide_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unfold_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsafe_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_as_real_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_vsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_where_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_acos_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addmm_decomposed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_angle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bool_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_complex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_conj_physical_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_corrcoef_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagflat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_eq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_irfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_hstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_igammac_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_reduce_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_inner_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cholesky_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cholesky_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_householder_product_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lstsq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_power_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_pinv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_pinv_hermitian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_max_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_msort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_native_dropout_backward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_hardsigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_instance_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_reflect_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_silu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_softplus_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_tanhshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_fro_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ormqr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_permute_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_put_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_remainder_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_repeat_interleave_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resolve_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_reduce_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sgn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sinc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sinh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_entr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sqrt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tensor_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_to_sparse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_topk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trunc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unfold_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_uniform_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unique_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_vstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rdiv___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__unsafe_masked_index_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_abs_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_aminmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bool_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bucketize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cfloat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cumulative_trapezoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagflat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagonal_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_div_trunc_rounding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_double_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expand_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_eye_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_hfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ihfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_rfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_rfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_flip_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_float_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_histc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_item_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cholesky_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cond_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_ldl_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lstsq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_factor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_factor_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_power_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_rank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_multi_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_solve_triangular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_vecdot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linspace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log10_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logcumsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logspace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logspace_tensor_overload_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_matrix_exp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_movedim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ne_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_ones_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_avg_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_batch_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_dropout_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_elu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_glu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_grid_sample_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_l1_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pairwise_distance_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pdist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_relu6_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_rms_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nonzero_static_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_normal_number_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ones_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_outer_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_permute_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_positive_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_pow_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_qr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rand_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resize__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rot90_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_short_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_legendre_polynomial_p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_with_sizes_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_with_sizes_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_squeeze_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_svd_lowrank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tensordot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_to_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_transpose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unfold_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unique_consecutive_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_vstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_where_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_xlogy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zeros_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zeros_cuda_float64 2025-12-04T10:32:52.6136104Z 2025-12-04T10:32:52.6136394Z Finished test_ops_fwd_gradients 3/12 ... [2025-12-04 10:32:52.590642][4851.48821049], took 11.57min 2025-12-04T10:32:52.6137461Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_ops_fwd_gradients/test_ops_fwd_gradients-5dbc324bf3632362.xml 2025-12-04T10:32:53.1733136Z Uploading artifacts took 0.50 seconds 2025-12-04T10:32:53.1737184Z Running test_ops_fwd_gradients 11/12 ... [2025-12-04 10:32:53.173391][4852.070957567] 2025-12-04T10:32:53.1737667Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:32:53.1741905Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_fwd_gradients.py', '--shard-id=11', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:32:53.173811] 2025-12-04T10:42:08.7966959Z 2025-12-04T10:42:08.7967979Z test_ops_fwd_gradients 11/12 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_fwd_gradients_11.12_f79076c6c96ab27b_.log 2025-12-04T10:42:08.8070501Z Running 266 items in this shard: test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rmul___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__segment_reduce_offsets_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addbmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_all_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_asin_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bfloat16_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_block_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_byte_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_clamp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_clamp_max_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_clone_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_combinations_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cross_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagonal_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_erfc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_flip_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_full_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_gather_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_gather_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_put_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_reduce_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isinf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lgamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cholesky_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eigh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_householder_product_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_multi_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_qr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_svd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_svd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logaddexp2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logaddexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_not_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_long_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mT_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_matmul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_matrix_exp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_gelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_leaky_relu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_normal_in_place_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ones_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ormqr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_permute_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_positive_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rad2deg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resolve_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_round_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signbit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sinh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sparse_sampled_addmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_airy_ai_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_bessel_y0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_i1e_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_ndtr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_with_sizes_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_squeeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sub_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_take_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_triu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unbind_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unflatten_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsqueeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_zero__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_zeros_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_alias_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_any_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argsort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argwhere_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_partial_views_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atleast_2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bfloat16_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_block_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_byte_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cdist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cholesky_inverse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_chunk_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_clone_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_double_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_equal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_irfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_grid_sampler_2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_grid_sampler_3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_reduce_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isfinite_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ldexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lgamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cross_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eigvalsh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_ldl_factor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_ldl_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_factor_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_vecdot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log10_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log1p_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log1p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logaddexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_or_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_xor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lu_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_softmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_max_reduction_with_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_maximum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_constant_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_reflect_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_rms_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nonzero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nonzero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ones_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ormqr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_polygamma_polygamma_n_1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rand_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_repeat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_short_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sigmoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_bartlett_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_exponential_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_general_cosine_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sinh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_slice_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_bessel_j1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_log_ndtr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_ndtri_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_squeeze_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_t_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_take_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_triangular_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsqueeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_where_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zeros_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zeros_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zeros_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rmatmul___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addmm_decomposed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_allclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_argmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_argsort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atanh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atleast_2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atleast_3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_byte_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_byte_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_chunk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_conj_physical_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_contiguous_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_count_nonzero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diag_embed_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_permuted_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expm1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_eye_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fliplr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_flipud_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ge_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_half_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_i0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_reduce_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cholesky_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_inv_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_pinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_svd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_tensorinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_tensorsolve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log_normal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log_softmax_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_and_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lu_unpack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mH_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_cumsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_var_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_matrix_exp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_min_reduction_no_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_msort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nan_to_num_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nanmean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_native_layer_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_interpolate_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_kl_div_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_leaky_relu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_linear_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_constant_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_threshold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_pca_lowrank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_permute_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_polygamma_polygamma_n_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ravel_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_reshape_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_reshape_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resize_as__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resolve_conj_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resolve_neg_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resolve_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_round_decimals_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_round_decimals_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rsub_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_stft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_triangular_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unfold_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsafe_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsqueeze_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_mean_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_vsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zero__cuda_complex128 2025-12-04T10:42:08.8170817Z 2025-12-04T10:42:08.8171090Z Finished test_ops_fwd_gradients 11/12 ... [2025-12-04 10:42:08.796575][5407.69414296], took 9.26min 2025-12-04T10:42:08.8172071Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_ops_fwd_gradients/test_ops_fwd_gradients-2cb8345fee0d20d1.xml 2025-12-04T10:42:08.9044463Z Running test_ops_gradients 7/10 ... [2025-12-04 10:42:08.904118][5407.801687198] 2025-12-04T10:42:08.9045010Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:42:08.9048068Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '--shard-id=7', '--num-shards=10', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:42:08.904442] 2025-12-04T10:50:40.0000747Z 2025-12-04T10:50:40.0001606Z test_ops_gradients 7/10 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_7.10_0772651414b4f002_.log 2025-12-04T10:50:40.0202688Z Running 550 items in this shard: test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___getitem___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_interleave_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rpow___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_broadcast_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_invoke_quant_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_triple_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randint_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_while_loop_stack_output_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hash_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_quant_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logcumsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_map_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_t_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyCatCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyCubeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argwhere_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hash_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapz_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argwhere_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_square_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_like_cuda_float64 2025-12-04T10:50:40.0397514Z 2025-12-04T10:50:40.0397794Z Finished test_ops_gradients 7/10 ... [2025-12-04 10:50:40.000268][5918.897836377], took 8.52min 2025-12-04T10:50:40.0398735Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_ops_gradients/test_ops_gradients-d3fb84b182f25804.xml 2025-12-04T10:50:40.1065286Z Running test_linalg 1/1 ... [2025-12-04 10:50:40.106178][5919.00374655] 2025-12-04T10:50:40.1065724Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:50:40.1069193Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_linalg.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:50:40.106518] 2025-12-04T10:54:11.3338787Z 2025-12-04T10:54:11.3342485Z test_linalg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_linalg_1.1_598146089d6b6d6f_.log 2025-12-04T10:54:11.3742370Z Running 1263 items in this shard: test/test_linalg.py::TestLinalgCUDA::test_1_sized_with_0_strided_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_1_sized_with_0_strided_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_1_k_128_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_1_k_128_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_1_k_64_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_1_k_64_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_32_k_128_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_32_k_128_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_32_k_64_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_32_k_64_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_256_n_128_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_256_n_32_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_256_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_256_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_64_n_128_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_64_n_32_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_32_k_32_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_32_k_32_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_32_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_32_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_64_k_32_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_64_k_32_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_64_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_64_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_large_shape_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_48_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_48_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_48_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_48_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_64_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_64_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_64_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_64_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_48_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_48_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_48_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_48_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_64_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_64_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_64_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_64_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_48_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_48_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_48_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_48_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_64_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_64_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_64_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_64_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_48_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_48_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_48_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_48_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_64_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_64_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_64_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_64_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_errors_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_16_use_transpose_a_False_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_16_use_transpose_a_False_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_16_use_transpose_a_True_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_16_use_transpose_a_True_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_32_use_transpose_a_False_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_32_use_transpose_a_False_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_32_use_transpose_a_True_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_32_use_transpose_a_True_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_16_use_transpose_a_False_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_16_use_transpose_a_False_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_16_use_transpose_a_True_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_16_use_transpose_a_True_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_32_use_transpose_a_False_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_32_use_transpose_a_False_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_32_use_transpose_a_True_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_32_use_transpose_a_True_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_baddbmm_overflow_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_gelu_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_gelu_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_gelu_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_gelu_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_tunableop_rocm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_tunableop_rocm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_tunableop_rocm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_tunableop_rocm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_sizes_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addmm_sizes_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addmm_sizes_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_sizes_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmv_rowmajor_colmajor_incx_incy_lda_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmv_rowmajor_colmajor_incx_incy_lda_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmv_rowmajor_colmajor_incx_incy_lda_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addr_bool_cuda_bool, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_int16, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_int32, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_int64, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_int8, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_uint8, test/test_linalg.py::TestLinalgCUDA::test_addr_type_promotion_cuda, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_int16, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_int32, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_int64, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_nan_input_with_zero_beta_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_blas_empty_cuda, test/test_linalg.py::TestLinalgCUDA::test_blas_mv_large_input_cuda, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_blaslog_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_bmm_tunableop_rocm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_broadcast_batched_matmul_cuda, test/test_linalg.py::TestLinalgCUDA::test_broadcast_fused_matmul_cuda, test/test_linalg.py::TestLinalgCUDA::test_call_count_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_chain_matmul_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_non_pd_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_non_pd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_non_pd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_non_pd_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_backward_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_many_batches_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_many_batches_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_many_batches_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_out_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_out_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_out_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_out_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_ck_blas_library_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_1_k_128_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_1_k_128_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_1_k_64_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_1_k_64_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_32_k_128_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_32_k_128_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_32_k_64_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_32_k_64_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_32_k_32_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_32_k_32_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_32_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_32_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_64_k_32_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_64_k_32_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_64_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_64_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_cond_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cond_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cond_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cond_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cond_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cond_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cond_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cond_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cross_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cross_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cross_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_cross_with_and_without_dim_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cross_with_and_without_dim_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_det_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_det_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_det_logdet_slogdet_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_det_logdet_slogdet_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_disable_tuning_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_dot_invalid_args_cuda, test/test_linalg.py::TestLinalgCUDA::test_dot_vs_numpy_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_dot_vs_numpy_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_dump_results_on_exit_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_check_magma_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_compare_backends_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eig_compare_backends_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eig_compare_backends_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_compare_backends_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eig_cuda_complex_eigenvectors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_cuda_complex_eigenvectors_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eig_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eig_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eig_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eig_numpy_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eig_numpy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eig_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_eig_with_nan_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eig_with_nan_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eig_with_nan_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_with_nan_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigh_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigh_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigh_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigh_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_lower_uplo_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigh_lower_uplo_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigh_lower_uplo_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_lower_uplo_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_lwork_lapack_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigh_lwork_lapack_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigh_lwork_lapack_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_lwork_lapack_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_svd_illcondition_matrix_input_should_not_crash_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_svd_illcondition_matrix_input_should_not_crash_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_compare_backends_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvals_compare_backends_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_compare_backends_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigvals_compare_backends_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvals_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigvals_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_numpy_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvals_numpy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_einsum_corner_cases_cuda, test/test_linalg.py::TestLinalgCUDA::test_einsum_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_einsum_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_einsum_error_cases_cuda, test/test_linalg.py::TestLinalgCUDA::test_einsum_output_layout_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_einsum_random_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_einsum_random_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_einsum_sublist_format_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_einsum_sublist_format_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_gemm_bias_offline_tunableop_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_gemm_bias_tunableop_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_geqrf_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_geqrf_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_geqrf_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_geqrf_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_householder_product_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_householder_product_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_householder_product_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_householder_product_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_householder_product_errors_and_warnings_cuda, test/test_linalg.py::TestLinalgCUDA::test_inner_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inner_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inv_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inv_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inv_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inv_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_info_device_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_info_device_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_info_device_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_info_device_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_singular_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_singular_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_singular_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_singular_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_invariance_error_spectral_decompositions_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inverse_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inverse_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_large_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_large_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_large_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_large_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inverse_many_batches_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_many_batches_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inverse_many_batches_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inverse_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_kron_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_kron_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_kron_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_kron_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_kron_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_kron_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_kron_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_kron_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_kron_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_kron_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_kron_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_kron_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lapack_empty_cuda, test/test_linalg.py::TestLinalgCUDA::test_large_bmm_backward_cuda, test/test_linalg.py::TestLinalgCUDA::test_large_bmm_mm_backward_cuda, test/test_linalg.py::TestLinalgCUDA::test_ldl_factor_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_ldl_factor_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_ldl_factor_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ldl_factor_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_ldl_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_ldl_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_ldl_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ldl_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_cross_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_cross_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_cross_with_and_without_dim_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_cross_with_and_without_dim_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_batch_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_batch_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_batch_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_batch_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_input_checks_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_input_checks_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_input_checks_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_input_checks_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_cpu_errors_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_cpu_errors_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_cpu_errors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_cpu_errors_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_family_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_family_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_family_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_family_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_analytic_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_analytic_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_analytic_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_analytic_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_batch_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_batch_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_boundary_cases_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_boundary_cases_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_boundary_cases_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_boundary_cases_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_compare_with_taylor_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_compare_with_taylor_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_compare_with_taylor_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_compare_with_taylor_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_no_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_perverse_nan_values_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_perverse_nan_values_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_perverse_nan_values_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_perverse_nan_values_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_utils_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_utils_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_qr_autograd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_large_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_large_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_large_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_large_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linear_algebra_scalar_raises_cuda, test/test_linalg.py::TestLinalgCUDA::test_lobpcg_basic_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lobpcg_ortho_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lobpcg_scipy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lobpcg_torchscript_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_logaddexp_cpu_vs_cuda_complex_cuda, test/test_linalg.py::TestLinalgCUDA::test_lower_precision_accumulation_with_ref_path_cuda, test/test_linalg.py::TestLinalgCUDA::test_lstsq_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_many_batches_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_many_batches_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_many_batches_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_large_matrices_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_large_matrices_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_large_matrices_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_large_matrices_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_unpack_check_input_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matmul_45724_cuda, test/test_linalg.py::TestLinalgCUDA::test_matmul_check_entries_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_matmul_empty_existing_file_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_mv_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_matmul_mv_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_matmul_mv_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_offline_mgpu_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_offline_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_matmul_out_kernel_errors_with_autograd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matmul_out_kernel_errors_with_autograd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_1d_Nd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_1d_Nd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_2d_Nd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_2d_Nd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_3d_Nd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_3d_Nd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_tunableop_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_exp_backward_input_validation_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_exp_backward_input_validation_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_exp_backward_input_validation_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_exp_backward_input_validation_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_norm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_norm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_power_negative_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_power_negative_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_power_non_negative_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_power_non_negative_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_rtol_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_basic_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_basic_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_basic_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_basic_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_out_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_out_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_out_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_out_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_minimum_tuning_iteration_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_mm_bmm_non_memory_dense_cuda, test/test_linalg.py::TestLinalgCUDA::test_mm_conjtranspose_cuda, test/test_linalg.py::TestLinalgCUDA::test_mm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_mm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_mm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_mm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_mm_empty_inputs_mixed_dtype_errors_cuda, test/test_linalg.py::TestLinalgCUDA::test_mm_submatrix_offline_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_multi_dot_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_multi_dot_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_multi_dot_errors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_bfloat16_and_half_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_norm_bfloat16_and_half_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_norm_complex_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_complex_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_complex_old_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_complexhalf_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_errors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_errors_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_extreme_values_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_fastpaths_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_fro_2_equivalence_old_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_fused_type_promotion_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_norm_fused_type_promotion_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_old_numpy_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_old_numpy_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_old_numpy_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_old_numpy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_old_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_old_nan_propagation_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_degenerate_shapes_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_degenerate_shapes_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_degenerate_shapes_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_degenerate_shapes_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_nuclear_norm_axes_small_brute_force_old_cuda, test/test_linalg.py::TestLinalgCUDA::test_nuclear_norm_exceptions_old_cuda, test/test_linalg.py::TestLinalgCUDA::test_nuclear_norm_out_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_nuclear_norm_out_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_numeric_check_leak_tunableop_rocm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_numerical_check_accuracy_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_numerical_check_accuracy_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_numerical_check_python_binding_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_offline_tuning_append_to_existing_file_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_upper_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_upper_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_upper_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_upper_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_ops_append_to_existing_file_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ormqr_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_ormqr_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_ormqr_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ormqr_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_ormqr_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_ormqr_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_ormqr_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ormqr_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_ger_addr_legacy_tests_cuda, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_uint8, test/test_linalg.py::TestLinalgCUDA::test_pca_lowrank_cuda, test/test_linalg.py::TestLinalgCUDA::test_permute_matmul_cuda, test/test_linalg.py::TestLinalgCUDA::test_pinv_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_pinv_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_pinv_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_pinv_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_pinv_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_pinv_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_pinv_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_pinv_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_pinverse_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_pinverse_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_pinverse_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_pinverse_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_preferred_blas_library_cuda, test/test_linalg.py::TestLinalgCUDA::test_preferred_linalg_library_cuda, test/test_linalg.py::TestLinalgCUDA::test_qr_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_qr_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_qr_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_qr_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_qr_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_qr_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_qr_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_qr_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_qr_error_cases_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_qr_vs_numpy_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_qr_vs_numpy_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_qr_vs_numpy_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_qr_vs_numpy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_renorm_cuda, test/test_linalg.py::TestLinalgCUDA::test_renorm_ps_cuda, test/test_linalg.py::TestLinalgCUDA::test_rotating_buffer_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_rowwise_scaled_gemm_numerics_tunableop_cuda_float8_e4m3fnuz, test/test_linalg.py::TestLinalgCUDA::test_scaled_gemm_offline_tunableop_cuda_float8_e4m3fnuz, test/test_linalg.py::TestLinalgCUDA::test_scaled_gemm_offline_tunableop_cuda_float8_e5m2fnuz, test/test_linalg.py::TestLinalgCUDA::test_scaled_gemm_tunableop_cuda_float8_e4m3fnuz, test/test_linalg.py::TestLinalgCUDA::test_scaled_gemm_tunableop_cuda_float8_e5m2fnuz, test/test_linalg.py::TestLinalgCUDA::test_slogdet_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_slogdet_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_slogdet_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_slogdet_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_slogdet_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_slogdet_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_slogdet_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_slogdet_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_solve_batched_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_solve_batched_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_solve_batched_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_solve_batched_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_solve_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_strided_mm_bmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_strided_mm_bmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_svd_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_svd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_svd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_svd_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_svd_lowrank_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_svd_lowrank_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_svd_memory_allocation_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_svd_memory_allocation_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_svd_memory_allocation_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_svd_memory_allocation_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_symeig_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_tensordot_cuda, test/test_linalg.py::TestLinalgCUDA::test_tensordot_out_kernel_errors_with_autograd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensordot_out_kernel_errors_with_autograd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_singular_input_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_singular_input_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_singular_input_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_singular_input_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tf32_offline_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tf32_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_many_batches_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_many_batches_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_many_batches_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_large_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_out_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_out_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_out_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_out_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_bool, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_int16, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_int32, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_int64, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_int8, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_extreme_k_values_cuda_uint8, test/test_linalg.py::TestLinalgCUDA::test_triu_tril_large_matrix_64bit_cuda, test/test_linalg.py::TestLinalgCUDA::test_validator_tunableop_rocm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_vdot_invalid_args_cuda, test/test_linalg.py::TestLinalgCUDA::test_vdot_vs_numpy_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_vdot_vs_numpy_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_decom_unbacked_checks_cuda, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_dim_tuple_arg_cuda, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_extreme_values_cuda, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_reduce_over_1D_vector_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_reduce_over_1D_vector_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_reduce_over_1D_vector_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_reduce_over_1D_vector_cuda_float64 2025-12-04T10:54:11.4134647Z 2025-12-04T10:54:11.4134888Z Finished test_linalg 1/1 ... [2025-12-04 10:54:11.335149][6130.232717792], took 3.52min 2025-12-04T10:54:11.4135749Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_linalg/test_linalg-ecb4fcce602354d1.xml 2025-12-04T10:54:12.0127333Z Uploading artifacts took 0.55 seconds 2025-12-04T10:54:12.0131480Z Running functorch/test_ops 2/6 ... [2025-12-04 10:54:12.012830][6130.910397119] 2025-12-04T10:54:12.0131946Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T10:54:12.0135818Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '--shard-id=2', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:54:12.013225] 2025-12-04T11:05:45.2129896Z 2025-12-04T11:05:45.2133196Z functorch/test_ops 2/6 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_2.6_77c851249c6366e1_.log 2025-12-04T11:05:45.2795252Z Running 1710 items in this shard: test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_log_softmax_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_squeeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_resize__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_amax_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_amin_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_clamp_cuda_complex128, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_floor_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_ordered_complex_raises_maximum_cuda_complex32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_T_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_diagonal_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_diagonal_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_hsplit_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_real_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_real_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_reshape_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_conj_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_neg_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_select_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_squeeze_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_squeeze_multiple_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_transpose_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_unflatten_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_as_complex_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_as_complex_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_grid_sampler_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_hash_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_tensor_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_transpose_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_hash_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_permute_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyCubeAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyTakeAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_T_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__native_batch_norm_legit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addbmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_amin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_aminmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_argmin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_as_strided_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_broadcast_tensors_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_chalf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_char_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_contiguous_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cummax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diag_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diagonal_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_digamma_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_eye_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fft2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifft2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_flatten_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_floor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_geometric_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_half_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_reduce_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isreal_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_unary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_kthvalue_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ldexp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lerp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eigvalsh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_matrix_power_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_svd_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_tensorsolve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lt_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_amax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_fill_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_logaddexp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_sum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_binary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_reduction_with_dim_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_median_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_min_binary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_min_reduction_no_dim_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_multinomial_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_zeros_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_max_pool2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_batch_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv_transpose1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cosine_similarity_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_ctc_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_l1_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pdist_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softsign_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_threshold_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nonzero_static_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_normal_in_place_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ones_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ormqr_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pca_lowrank_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_permute_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_prod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rand_like_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_remainder_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_remainder_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rot90_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_amin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_select_scatter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_cosine_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_general_hamming_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_hamming_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signbit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sinc_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_slice_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sparse_sampled_addmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_entr_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_i1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_laguerre_polynomial_l_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtr_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtri_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_polygamma_special_polygamma_n_0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_zeta_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_list_args_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_squeeze_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_squeeze_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_stft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_t_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tensordot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_torch_ops_aten__safe_softmax_default_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_triangular_solve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_triu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trunc_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_uniform_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vdot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_angle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_resize__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_squeeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_transpose_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_resize__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_argwhere_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unbind_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvmapjvp_linalg_solve_cuda 2025-12-04T11:05:45.3386670Z 2025-12-04T11:05:45.3386946Z Finished functorch/test_ops 2/6 ... [2025-12-04 11:05:45.215676][6824.113243314], took 11.55min 2025-12-04T11:05:45.3387901Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-c5a286cc3fc0ce2c.xml 2025-12-04T11:05:45.3388805Z Running inductor/test_cpu_repro 2/2 ... [2025-12-04 11:05:45.337484][6824.235050854] 2025-12-04T11:05:45.3389257Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:05:45.3390244Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_repro.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:05:45.337846] 2025-12-04T11:16:47.8913648Z 2025-12-04T11:16:47.8914521Z inductor/test_cpu_repro 2/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_repro_2.2_6e51b985f7d36046_.log 2025-12-04T11:16:47.9135143Z Running 365 items in this shard: test/inductor/test_cpu_repro.py::CPUReproTests::test_ModularIndexing_range_issue_103133, test/inductor/test_cpu_repro.py::CPUReproTests::test__adaptive_avg_pool2d, test/inductor/test_cpu_repro.py::CPUReproTests::test_acosh_with_negative_large_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_argmax_argmin_with_nan_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_atomic_add_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_auto_zvec_vsx_simd, test/inductor/test_cpu_repro.py::CPUReproTests::test_bf16_zeros, test/inductor/test_cpu_repro.py::CPUReproTests::test_bitwise_logical_op_bool, test/inductor/test_cpu_repro.py::CPUReproTests::test_bool_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_broadcast_scalar_cpp_tile_2d_kernel, test/inductor/test_cpu_repro.py::CPUReproTests::test_channel_shuffle_cl_output, test/inductor/test_cpu_repro.py::CPUReproTests::test_channels_last_view_as_complex, test/inductor/test_cpu_repro.py::CPUReproTests::test_complex_cholesky_mh_view_fallback, test/inductor/test_cpu_repro.py::CPUReproTests::test_complex_memory_overlap, test/inductor/test_cpu_repro.py::CPUReproTests::test_concat_inner_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_consistent_remove_buffers, test/inductor/test_cpu_repro.py::CPUReproTests::test_constant_store, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv2d_packed, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_in_channel_1_dynamic_shapes, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_stride_constraints, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int32_to_int64_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int64_to_int32_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_cpp_kernel_profile, test/inductor/test_cpu_repro.py::CPUReproTests::test_decomposed_dequant_relu_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_maxpool2d_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_maxpool2d_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_relu_quant_dequant_relu_quant_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_relu_quant_dequant_relu_quant_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_disabled_amp_is_inference_True, test/inductor/test_cpu_repro.py::CPUReproTests::test_dropout, test/inductor/test_cpu_repro.py::CPUReproTests::test_embedding_vec_bf16, test/inductor/test_cpu_repro.py::CPUReproTests::test_for_loop_collapsed, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp32_load_with_to_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_bfloat16_shape_15,3,13, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_bfloat16_shape_4,2048,4096, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float16_shape_15,3,13, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float16_shape_4,2048,4096, test/inductor/test_cpu_repro.py::CPUReproTests::test_frexp, test/inductor/test_cpu_repro.py::CPUReproTests::test_fused_node, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_large_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_horizontal_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_in_out_buffer, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_propagation_issue_102065, test/inductor/test_cpu_repro.py::CPUReproTests::test_inplace_add_alpha, test/inductor/test_cpu_repro.py::CPUReproTests::test_int32_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int64_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int_div_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_invalid_index_of_empty_tensor, test/inductor/test_cpu_repro.py::CPUReproTests::test_ir_node_str, test/inductor/test_cpu_repro.py::CPUReproTests::test_issue122380, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_packed, test/inductor/test_cpu_repro.py::CPUReproTests::test_load_half, test/inductor/test_cpu_repro.py::CPUReproTests::test_load_inf_bf16, test/inductor/test_cpu_repro.py::CPUReproTests::test_load_same_bool_tensor_twice, test/inductor/test_cpu_repro.py::CPUReproTests::test_local_buffer_with_line_reuse, test/inductor/test_cpu_repro.py::CPUReproTests::test_logical_op_store_to_lowp_data_dtype, test/inductor/test_cpu_repro.py::CPUReproTests::test_lowp_fp_neg_abs, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_change_input_sizes_cpu_unbatched_False_input_size_2_hidden_size_5_num_layers_3_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_2_seq_len_3, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_masked_load_int64_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_max_reduction_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_maxpool2d_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_maxpool2d_with_pre_loop_collapse_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_memory_copy_with_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_meta_device, test/inductor/test_cpu_repro.py::CPUReproTests::test_mkl_linear, test/inductor/test_cpu_repro.py::CPUReproTests::test_multihead_attention_cpu, test/inductor/test_cpu_repro.py::CPUReproTests::test_new_vec_op_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_nn_param_assign, test/inductor/test_cpu_repro.py::CPUReproTests::test_nn_param_assign_wrapped, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_index_with_constant_stride, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_load_buf_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_reduction_store, test/inductor/test_cpu_repro.py::CPUReproTests::test_outer_looop_fusion_with_local_buf, test/inductor/test_cpu_repro.py::CPUReproTests::test_outer_loop_fusion_buffer_remove, test/inductor/test_cpu_repro.py::CPUReproTests::test_outer_mean_large_size, test/inductor/test_cpu_repro.py::CPUReproTests::test_pack_padded_sequence_lstm, test/inductor/test_cpu_repro.py::CPUReproTests::test_parallel_num_threads, test/inductor/test_cpu_repro.py::CPUReproTests::test_parallel_reduction_vectorization, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_module_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_pow_cos, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduce_with_masked, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduction_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_redundant_to_node_elimination_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_relu_with_inf_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_require_stride_order_non_owning, test/inductor/test_cpu_repro.py::CPUReproTests::test_scatter_using_atomic_add, test/inductor/test_cpu_repro.py::CPUReproTests::test_scatter_using_atomic_add_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_select_tiliing_with_index_expr, test/inductor/test_cpu_repro.py::CPUReproTests::test_share_local_buffers_in_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_sigmoid_with_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_sign_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_softmax_with_zero_dim, test/inductor/test_cpu_repro.py::CPUReproTests::test_store_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_symbolic_shape_scalar_value_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_load_decomposed_dequant_add_relu_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_channels_last_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_dtype_bool_float, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_dtype_float_bool, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_uint8_rounding_method, test/inductor/test_cpu_repro.py::CPUReproTests::test_torch_linalg_qr_tuple_slice, test/inductor/test_cpu_repro.py::CPUReproTests::test_torch_logit, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_sum2d_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_vertical_sum_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_two_local_buffers_in_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint64_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint64_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint8_add, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint8_sub, test/inductor/test_cpu_repro.py::CPUReproTests::test_unrolled_bool_prod_vectorized, test/inductor/test_cpu_repro.py::CPUReproTests::test_unsupported_conv_transpose, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_bitwise, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_indirect_load_cse_cache, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_logical, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_randn, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_remainder, test/inductor/test_cpu_repro.py::CPUReproTests::test_view_dtype 2025-12-04T11:16:47.9343266Z 2025-12-04T11:16:47.9343569Z Finished inductor/test_cpu_repro 2/2 ... [2025-12-04 11:16:47.892082][7486.789650385], took 11.04min 2025-12-04T11:16:47.9344574Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_cpu_repro/inductor.test_cpu_repro-cff3315ca24db3df.xml 2025-12-04T11:16:49.1620570Z Uploading artifacts took 1.18 seconds 2025-12-04T11:16:49.1625548Z Running dynamo/test_ctx_manager 1/1 ... [2025-12-04 11:16:49.162224][7488.059791175] 2025-12-04T11:16:49.1626125Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:16:49.1630245Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_ctx_manager.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:16:49.162682] 2025-12-04T11:17:05.9616136Z 2025-12-04T11:17:05.9617899Z dynamo/test_ctx_manager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_ctx_manager_1.1_66ea9ab40c2c7714_.log 2025-12-04T11:17:05.9655761Z Running 104 items in this shard: test/dynamo/test_ctx_manager.py::CtxManagerTests::test_311_resume_block_keyerror, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_311_resume_block_keyerror2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_arguments_binding, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_cpu, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_cpu_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_cpu_graph_break_2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_cpu_graph_break_inner_fn, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_decorator, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_device, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_float64, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_graph_break_method, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autocast_sdpa, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autograd_profiler, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_autograd_profiler_enabled, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_context_wrapping_grad_mode_decorator, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_context_wrapping_grad_mode_nested_function_decorator, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_context_wrapping_set_grad_enabled_nested_function, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_amp_autocast, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_device, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_across_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_created_outside_of_graph, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_method, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_method_create_stream_outside_of_compile, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_reconstruct, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_across_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_compared_with_constant, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_compared_with_stream, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_context_manager1, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_context_manager2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_stream_method, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_disable_saved_tensors_hooks, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_disable_saved_tensors_hooks_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_disable_saved_tensors_hooks_prev_disabled, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_disable_saved_tensors_hooks_prev_disabled_nested, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_context_manager_CustomizedCtxManager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_context_manager_customized_ctx_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_context_manager_with_graph_break_CustomizedCtxManager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_context_manager_with_graph_break_customized_ctx_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_ctx_manager_with_graph_break_CustomizedCtxManagerWithGraphBreak, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_generic_ctx_manager_with_graph_break_customized_ctx_manager_with_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_grad_mode_guard, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_graph_break_inlining_autocast, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_graph_break_inlining_grad, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_local, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_local_nullctx, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_local_nullctx2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_stack, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_inactive_context_graph_break_stack2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_is_autocast_cpu_enabled, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_generic_context_manager_CustomizedCtxManager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_generic_context_manager_customized_ctx_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_generic_context_manager_with_graph_break_CustomizedCtxManager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_generic_context_manager_with_graph_break_customized_ctx_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_nested_grad_mode_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_no_grad, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_return_context_manager, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_return_context_manager_with_graph_break, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager1, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager2, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager3, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager_as_decorator, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager_kwargs, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_sdpa_kernel_ctx_manager_set_priority, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_store_attr_graph_break_key_error, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_torch_profiler, test/dynamo/test_ctx_manager.py::CtxManagerTests::test_torch_profiler_use_after_with_block, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_WITH_EXCEPT_START, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_advanced_contextmanager_as_argument, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_advanced_contextmanager_as_argument_error, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_0, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_global_0, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_global_1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_nonlocal_0, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_change_parent_nonlocal_1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextlib_nullcontext, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextlib_suppress_name_stderr, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextlib_suppress_name_stdout, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextlib_suppress_name_suppress, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextmanager_as_argument, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextmanager_as_argument_only___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_contextmanager_as_argument_only___exit__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_ctx_basic0, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_ctx_basic1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_disable___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_disable___exit__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_disable_ctx_manager, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_disable_trace_contextmanager, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_dynamo_disable_ctx, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_globals_change_in_other_file, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_after___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_and_disable___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_before___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_before___enter___and_disable___exit__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_before_and_after___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_in_finally, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside___enter__, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside_ctx, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside_ctx_1, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside_ctx_2, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_graph_break_inside_ctx_with_side_effects, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_return_advanced_contextmanager, test/dynamo/test_ctx_manager.py::ContextlibContextManagerTests::test_return_new_contextmanager 2025-12-04T11:17:05.9691657Z 2025-12-04T11:17:05.9691935Z Finished dynamo/test_ctx_manager 1/1 ... [2025-12-04 11:17:05.961240][7504.858808371], took 0.28min 2025-12-04T11:17:05.9692937Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_ctx_manager/dynamo.test_ctx_manager-9d815792ffb0d789.xml 2025-12-04T11:17:06.0559059Z Running inductor/test_cpu_select_algorithm 1/1 ... [2025-12-04 11:17:06.055499][7504.953068417] 2025-12-04T11:17:06.0559746Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:17:06.0562507Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_select_algorithm.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:17:06.055852] 2025-12-04T11:17:13.0336542Z 2025-12-04T11:17:13.0337570Z inductor/test_cpu_select_algorithm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_select_algorithm_1.1_5984ab56237313fd_.log 2025-12-04T11:17:13.0338419Z Running 0 items in this shard: 2025-12-04T11:17:13.0338607Z 2025-12-04T11:17:13.0338932Z Finished inductor/test_cpu_select_algorithm 1/1 ... [2025-12-04 11:17:13.033305][7511.930874369], took 0.12min 2025-12-04T11:17:13.0374445Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_cpu_select_algorithm/inductor.test_cpu_select_algorithm-5324e07d08f84675.xml 2025-12-04T11:17:13.1049680Z Running inductor/test_torchinductor_strided_blocks 1/1 ... [2025-12-04 11:17:13.104574][7512.00214269] 2025-12-04T11:17:13.1050228Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:17:13.1053059Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_strided_blocks.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:17:13.104923] 2025-12-04T11:18:03.1180030Z 2025-12-04T11:18:03.1181335Z inductor/test_torchinductor_strided_blocks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_strided_blocks_1.1_b1fb6d4ac5e5d775_.log 2025-12-04T11:18:03.1364781Z Running 302 items in this shard: test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reduction_multi_kernel_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reduction_odd_shapes_view_size0_num_block_pointers_1_num_triton_kernels_1_reduction_op0_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reduction_odd_shapes_view_size1_num_block_pointers_3_num_triton_kernels_2_reduction_op1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reduction_odd_shapes_view_size2_num_block_pointers_1_num_triton_kernels_1_reduction_op2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reduction_odd_shapes_view_size3_num_block_pointers_1_num_triton_kernels_1_reduction_op3_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reduction_odd_shapes_view_size4_num_block_pointers_1_num_triton_kernels_1_reduction_op4_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reductions_mixed_indexing_reduction_op0_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reductions_mixed_indexing_reduction_op1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_welford_reduction_size0_expected_num_block_pointers_1_expected_num_triton_kernels_1_expect_fallback_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_welford_reduction_size1_expected_num_block_pointers_7_expected_num_triton_kernels_2_expect_fallback_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_3d_permute_tiling_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_boundary_check_block_multiple_False_ynumel_exceed_ygrid_size_False_include_z_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_boundary_check_block_multiple_True_ynumel_exceed_ygrid_size_False_include_z_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_boundary_check_block_multiple_True_ynumel_exceed_ygrid_size_True_include_z_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_prefer_nd_tiling_False_x_size0_y_size0_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_prefer_nd_tiling_False_x_size1_y_size1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_prefer_nd_tiling_False_x_size2_y_size2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_prefer_nd_tiling_False_x_size3_y_size3_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_prefer_nd_tiling_True_x_size0_y_size0_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_prefer_nd_tiling_True_x_size1_y_size1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_prefer_nd_tiling_True_x_size2_y_size2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_prefer_nd_tiling_True_x_size3_y_size3_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_broadcast_with_singleton_dims_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_complex_reshape_block_ptr_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_dynamic_shapes_pointwise_multiple_max_block_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_dynamic_shapes_pointwise_nd_tiling_False_num_block_pointers_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_dynamic_shapes_pointwise_nd_tiling_True_num_block_pointers_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_dynamic_shapes_reduction_with_tiling_False_num_block_pointers_0_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_dynamic_shapes_reduction_with_tiling_True_num_block_pointers_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_enable_tiled_reductions_tile_reductions_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_enable_tiled_reductions_tile_reductions_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_ensure_integral_dims_and_strides_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size0_y_size0_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size1_y_size1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size2_y_size2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size3_y_size3_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size4_y_size4_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size5_y_size5_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size6_y_size6_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size7_y_size7_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size8_y_size8_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_broadcast_x_size9_y_size9_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expand_clone_broadcast_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expected_num_block_pointers_expected_num_block_pointers_3_raises_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_expected_num_block_pointers_expected_num_block_pointers_9_raises_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_fused_2d_reduction_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_mixed_pointwise_reduction_view_size0_num_block_pointers_2_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_mixed_pointwise_reduction_view_size1_num_block_pointers1_num_triton_kernels1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_mul_broadcast_multi_output_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_multiple_max_block_non_power_of_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_nd_tiling_odd_shapes_pointwise_full_size0_view_size0_num_block_pointers_3_num_tiles_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_nd_tiling_odd_shapes_pointwise_full_size1_view_size1_num_block_pointers_3_num_tiles_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_nd_tiling_odd_shapes_pointwise_full_size2_view_size2_num_block_pointers_3_num_tiles_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_nd_tiling_odd_shapes_pointwise_full_size3_view_size3_num_block_pointers_3_num_tiles_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_nd_tiling_odd_shapes_pointwise_full_size4_view_size4_num_block_pointers_3_num_tiles_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_nd_tiling_odd_shapes_pointwise_full_size5_view_size5_num_block_pointers_1_num_tiles_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_negative_strides_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_broadcast_nonzero_strides_prefer_nd_tiling_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_broadcast_nonzero_strides_prefer_nd_tiling_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_index_order_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size0_view_size0_stride0_offset0_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size1_view_size1_stride1_offset1_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size2_view_size2_stride2_offset2_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size3_view_size3_stride3_offset_10_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size4_view_size4_stride4_offset4_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size5_view_size5_stride5_offset5_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size6_view_size6_stride6_offset6_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size7_view_size7_stride7_offset7_require_block_ptr_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size8_view_size8_stride8_offset8_require_block_ptr_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_False_full_size9_view_size9_stride9_offset9_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size0_view_size0_stride0_offset0_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size1_view_size1_stride1_offset1_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size2_view_size2_stride2_offset2_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size3_view_size3_stride3_offset_10_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size4_view_size4_stride4_offset4_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size5_view_size5_stride5_offset5_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size6_view_size6_stride6_offset6_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size7_view_size7_stride7_offset7_require_block_ptr_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size8_view_size8_stride8_offset8_require_block_ptr_False_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_pointwise_prefer_nd_tiling_True_full_size9_view_size9_stride9_offset9_require_block_ptr_True_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_multiple_discontiguous_dims_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_False_view_size0_num_block_pointers_1_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_False_view_size1_num_block_pointers_1_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_False_view_size2_num_block_pointers_1_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_False_view_size3_num_block_pointers3_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_False_view_size4_num_block_pointers_3_num_triton_kernels_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_False_view_size5_num_block_pointers_2_num_triton_kernels_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_False_view_size6_num_block_pointers_3_num_triton_kernels_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_True_view_size0_num_block_pointers_1_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_True_view_size1_num_block_pointers_1_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_True_view_size2_num_block_pointers_1_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_True_view_size3_num_block_pointers3_num_triton_kernels_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_True_view_size4_num_block_pointers_3_num_triton_kernels_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_True_view_size5_num_block_pointers_2_num_triton_kernels_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_reduction_prefer_nd_tiling_True_view_size6_num_block_pointers_3_num_triton_kernels_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_removed_buffers_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_unbacked_size_on_non_contig_dim_num_tile_candidates_1_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_unbacked_size_on_non_contig_dim_num_tile_candidates_2_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_welford_non_block_pointer_cpu, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reduction_multi_kernel_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reduction_odd_shapes_view_size0_num_block_pointers_1_num_triton_kernels_1_reduction_op0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reduction_odd_shapes_view_size1_num_block_pointers_3_num_triton_kernels_2_reduction_op1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reduction_odd_shapes_view_size2_num_block_pointers_1_num_triton_kernels_1_reduction_op2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reduction_odd_shapes_view_size3_num_block_pointers_1_num_triton_kernels_1_reduction_op3_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reduction_odd_shapes_view_size4_num_block_pointers_1_num_triton_kernels_1_reduction_op4_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reductions_mixed_indexing_reduction_op0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reductions_mixed_indexing_reduction_op1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_welford_reduction_size0_expected_num_block_pointers_1_expected_num_triton_kernels_1_expect_fallback_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_welford_reduction_size1_expected_num_block_pointers_7_expected_num_triton_kernels_2_expect_fallback_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_3d_permute_tiling_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_boundary_check_block_multiple_False_ynumel_exceed_ygrid_size_False_include_z_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_boundary_check_block_multiple_True_ynumel_exceed_ygrid_size_False_include_z_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_boundary_check_block_multiple_True_ynumel_exceed_ygrid_size_True_include_z_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_prefer_nd_tiling_False_x_size0_y_size0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_prefer_nd_tiling_False_x_size1_y_size1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_prefer_nd_tiling_False_x_size2_y_size2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_prefer_nd_tiling_False_x_size3_y_size3_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_prefer_nd_tiling_True_x_size0_y_size0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_prefer_nd_tiling_True_x_size1_y_size1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_prefer_nd_tiling_True_x_size2_y_size2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_prefer_nd_tiling_True_x_size3_y_size3_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_broadcast_with_singleton_dims_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_complex_reshape_block_ptr_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_dynamic_shapes_pointwise_multiple_max_block_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_dynamic_shapes_pointwise_nd_tiling_False_num_block_pointers_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_dynamic_shapes_pointwise_nd_tiling_True_num_block_pointers_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_dynamic_shapes_reduction_with_tiling_False_num_block_pointers_0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_dynamic_shapes_reduction_with_tiling_True_num_block_pointers_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_enable_tiled_reductions_tile_reductions_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_enable_tiled_reductions_tile_reductions_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_ensure_integral_dims_and_strides_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size0_y_size0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size1_y_size1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size2_y_size2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size3_y_size3_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size4_y_size4_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size5_y_size5_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size6_y_size6_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size7_y_size7_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size8_y_size8_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_broadcast_x_size9_y_size9_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expand_clone_broadcast_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expected_num_block_pointers_expected_num_block_pointers_3_raises_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_expected_num_block_pointers_expected_num_block_pointers_9_raises_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_fused_2d_reduction_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_mixed_pointwise_reduction_view_size0_num_block_pointers_2_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_mixed_pointwise_reduction_view_size1_num_block_pointers1_num_triton_kernels1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_mul_broadcast_multi_output_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_multiple_max_block_non_power_of_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_nd_tiling_odd_shapes_pointwise_full_size0_view_size0_num_block_pointers_3_num_tiles_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_nd_tiling_odd_shapes_pointwise_full_size1_view_size1_num_block_pointers_3_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_nd_tiling_odd_shapes_pointwise_full_size2_view_size2_num_block_pointers_3_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_nd_tiling_odd_shapes_pointwise_full_size3_view_size3_num_block_pointers_3_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_nd_tiling_odd_shapes_pointwise_full_size4_view_size4_num_block_pointers_3_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_nd_tiling_odd_shapes_pointwise_full_size5_view_size5_num_block_pointers_1_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_negative_strides_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_broadcast_nonzero_strides_prefer_nd_tiling_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_broadcast_nonzero_strides_prefer_nd_tiling_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_index_order_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size0_view_size0_stride0_offset0_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size1_view_size1_stride1_offset1_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size2_view_size2_stride2_offset2_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size3_view_size3_stride3_offset_10_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size4_view_size4_stride4_offset4_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size5_view_size5_stride5_offset5_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size6_view_size6_stride6_offset6_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size7_view_size7_stride7_offset7_require_block_ptr_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size8_view_size8_stride8_offset8_require_block_ptr_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_False_full_size9_view_size9_stride9_offset9_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size0_view_size0_stride0_offset0_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size1_view_size1_stride1_offset1_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size2_view_size2_stride2_offset2_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size3_view_size3_stride3_offset_10_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size4_view_size4_stride4_offset4_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size5_view_size5_stride5_offset5_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size6_view_size6_stride6_offset6_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size7_view_size7_stride7_offset7_require_block_ptr_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size8_view_size8_stride8_offset8_require_block_ptr_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_pointwise_prefer_nd_tiling_True_full_size9_view_size9_stride9_offset9_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_multiple_discontiguous_dims_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_False_view_size0_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_False_view_size1_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_False_view_size2_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_False_view_size3_num_block_pointers3_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_False_view_size4_num_block_pointers_3_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_False_view_size5_num_block_pointers_2_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_False_view_size6_num_block_pointers_3_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_True_view_size0_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_True_view_size1_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_True_view_size2_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_True_view_size3_num_block_pointers3_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_True_view_size4_num_block_pointers_3_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_True_view_size5_num_block_pointers_2_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_reduction_prefer_nd_tiling_True_view_size6_num_block_pointers_3_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_removed_buffers_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_unbacked_size_on_non_contig_dim_num_tile_candidates_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_unbacked_size_on_non_contig_dim_num_tile_candidates_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_welford_non_block_pointer_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reduction_multi_kernel_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reduction_odd_shapes_view_size0_num_block_pointers_1_num_triton_kernels_1_reduction_op0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reduction_odd_shapes_view_size1_num_block_pointers_3_num_triton_kernels_2_reduction_op1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reduction_odd_shapes_view_size2_num_block_pointers_1_num_triton_kernels_1_reduction_op2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reduction_odd_shapes_view_size3_num_block_pointers_1_num_triton_kernels_1_reduction_op3_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reduction_odd_shapes_view_size4_num_block_pointers_1_num_triton_kernels_1_reduction_op4_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reductions_mixed_indexing_reduction_op0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_reductions_mixed_indexing_reduction_op1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_welford_reduction_size0_expected_num_block_pointers_1_expected_num_triton_kernels_1_expect_fallback_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_2d_welford_reduction_size1_expected_num_block_pointers_7_expected_num_triton_kernels_2_expect_fallback_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_3d_permute_tiling_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_boundary_check_block_multiple_False_ynumel_exceed_ygrid_size_False_include_z_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_boundary_check_block_multiple_True_ynumel_exceed_ygrid_size_False_include_z_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_boundary_check_block_multiple_True_ynumel_exceed_ygrid_size_True_include_z_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_prefer_nd_tiling_False_x_size0_y_size0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_prefer_nd_tiling_False_x_size1_y_size1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_prefer_nd_tiling_False_x_size2_y_size2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_prefer_nd_tiling_False_x_size3_y_size3_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_prefer_nd_tiling_True_x_size0_y_size0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_prefer_nd_tiling_True_x_size1_y_size1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_prefer_nd_tiling_True_x_size2_y_size2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_prefer_nd_tiling_True_x_size3_y_size3_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_broadcast_with_singleton_dims_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_complex_reshape_block_ptr_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_dynamic_shapes_pointwise_multiple_max_block_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_dynamic_shapes_pointwise_nd_tiling_False_num_block_pointers_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_dynamic_shapes_pointwise_nd_tiling_True_num_block_pointers_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_dynamic_shapes_reduction_with_tiling_False_num_block_pointers_0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_dynamic_shapes_reduction_with_tiling_True_num_block_pointers_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_enable_tiled_reductions_tile_reductions_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_enable_tiled_reductions_tile_reductions_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_ensure_integral_dims_and_strides_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size0_y_size0_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size1_y_size1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size2_y_size2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size3_y_size3_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size4_y_size4_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size5_y_size5_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size6_y_size6_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size7_y_size7_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size8_y_size8_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_broadcast_x_size9_y_size9_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expand_clone_broadcast_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expected_num_block_pointers_expected_num_block_pointers_3_raises_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_expected_num_block_pointers_expected_num_block_pointers_9_raises_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_fused_2d_reduction_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_match_with_transpose_view_size0_permute_order0_num_tensor_descriptors_3_expect_transpose_False, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_match_with_transpose_view_size1_permute_order1_num_tensor_descriptors_3_expect_transpose_False, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_match_with_transpose_view_size2_permute_order2_num_tensor_descriptors_3_expect_transpose_True, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_match_with_transpose_view_size3_permute_order3_num_tensor_descriptors_3_expect_transpose_True, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_match_with_transpose_view_size4_permute_order4_num_tensor_descriptors_3_expect_transpose_True, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_mixed_pointwise_reduction_view_size0_num_block_pointers_2_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_mixed_pointwise_reduction_view_size1_num_block_pointers1_num_triton_kernels1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_mul_broadcast_multi_output_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_multiple_max_block_non_power_of_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_nd_tiling_odd_shapes_pointwise_full_size0_view_size0_num_block_pointers_3_num_tiles_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_nd_tiling_odd_shapes_pointwise_full_size1_view_size1_num_block_pointers_3_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_nd_tiling_odd_shapes_pointwise_full_size2_view_size2_num_block_pointers_3_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_nd_tiling_odd_shapes_pointwise_full_size3_view_size3_num_block_pointers_3_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_nd_tiling_odd_shapes_pointwise_full_size4_view_size4_num_block_pointers_3_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_nd_tiling_odd_shapes_pointwise_full_size5_view_size5_num_block_pointers_1_num_tiles_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_negative_strides_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_broadcast_nonzero_strides_prefer_nd_tiling_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_broadcast_nonzero_strides_prefer_nd_tiling_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_index_order_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size0_view_size0_stride0_offset0_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size1_view_size1_stride1_offset1_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size2_view_size2_stride2_offset2_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size3_view_size3_stride3_offset_10_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size4_view_size4_stride4_offset4_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size5_view_size5_stride5_offset5_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size6_view_size6_stride6_offset6_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size7_view_size7_stride7_offset7_require_block_ptr_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size8_view_size8_stride8_offset8_require_block_ptr_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_False_full_size9_view_size9_stride9_offset9_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size0_view_size0_stride0_offset0_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size1_view_size1_stride1_offset1_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size2_view_size2_stride2_offset2_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size3_view_size3_stride3_offset_10_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size4_view_size4_stride4_offset4_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size5_view_size5_stride5_offset5_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size6_view_size6_stride6_offset6_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size7_view_size7_stride7_offset7_require_block_ptr_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size8_view_size8_stride8_offset8_require_block_ptr_False_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_pointwise_prefer_nd_tiling_True_full_size9_view_size9_stride9_offset9_require_block_ptr_True_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_multiple_discontiguous_dims_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_False_view_size0_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_False_view_size1_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_False_view_size2_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_False_view_size3_num_block_pointers3_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_False_view_size4_num_block_pointers_3_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_False_view_size5_num_block_pointers_2_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_False_view_size6_num_block_pointers_3_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_True_view_size0_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_True_view_size1_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_True_view_size2_num_block_pointers_1_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_True_view_size3_num_block_pointers3_num_triton_kernels_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_True_view_size4_num_block_pointers_3_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_True_view_size5_num_block_pointers_2_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_reduction_prefer_nd_tiling_True_view_size6_num_block_pointers_3_num_triton_kernels_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_removed_buffers_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_unbacked_size_on_non_contig_dim_num_tile_candidates_1_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_unbacked_size_on_non_contig_dim_num_tile_candidates_2_cuda, test/inductor/test_torchinductor_strided_blocks.py::TritonTensorDescriptorTestCUDA::test_welford_non_block_pointer_cuda 2025-12-04T11:18:03.1542570Z 2025-12-04T11:18:03.1542998Z Finished inductor/test_torchinductor_strided_blocks 1/1 ... [2025-12-04 11:18:03.118668][7562.016235164], took 0.83min 2025-12-04T11:18:03.1544235Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_torchinductor_strided_blocks/inductor.test_torchinductor_strided_blocks-37bb13dfab68bc3d.xml 2025-12-04T11:18:03.2084007Z Running inductor/test_multi_kernel 1/1 ... [2025-12-04 11:18:03.207988][7562.105555927] 2025-12-04T11:18:03.2084482Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:18:03.2087154Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_multi_kernel.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:18:03.208357] 2025-12-04T11:18:48.7665943Z 2025-12-04T11:18:48.7667036Z inductor/test_multi_kernel 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_multi_kernel_1.1_bec4b3df976b796a_.log 2025-12-04T11:18:48.7674244Z Running 19 items in this shard: test/inductor/test_multi_kernel.py::MultiKernelTest::test_batchnorm_training, test/inductor/test_multi_kernel.py::MultiKernelTest::test_inplace_update, test/inductor/test_multi_kernel.py::MultiKernelTest::test_layernorm, test/inductor/test_multi_kernel.py::MultiKernelTest::test_pass_same_arg_multi_times, test/inductor/test_multi_kernel.py::MultiKernelTest::test_reduction_scratch_buffer, test/inductor/test_multi_kernel.py::MultiKernelTest::test_reduction_scratch_buffer_cpp_wrapper, test/inductor/test_multi_kernel.py::MultiKernelTest::test_reduction_scratch_buffer_cpp_wrapper_non_persistent_reduction, test/inductor/test_multi_kernel.py::MultiKernelTest::test_reduction_scratch_buffer_cpp_wrapper_persistent_reduction, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax_cpp_wrapper, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax_force_non_persistent_reduction_force_kernel_0, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax_force_non_persistent_reduction_force_kernel_1, test/inductor/test_multi_kernel.py::MultiKernelTest::test_softmax_warn_mixed_layout, test/inductor/test_multi_kernel.py::MultiKernelTest::test_sort_disables_multi_kernel, test/inductor/test_multi_kernel.py::MultiKernelTest::test_split_scan, test/inductor/test_multi_kernel.py::MultiKernelTest::test_transformer_snippet, test/inductor/test_multi_kernel.py::MultiKernelTest::test_transformer_snippet_with_fallback_random, test/inductor/test_multi_kernel.py::MultiKernelTest::test_triton_gemm, test/inductor/test_multi_kernel.py::MultiKernelTest::test_triton_relu_fused_gemm 2025-12-04T11:18:48.7680774Z 2025-12-04T11:18:48.7681076Z Finished inductor/test_multi_kernel 1/1 ... [2025-12-04 11:18:48.766282][7607.663851075], took 0.76min 2025-12-04T11:18:48.7708623Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_multi_kernel/inductor.test_multi_kernel-255fcbe2f02e1ae3.xml 2025-12-04T11:18:48.8433062Z Running inductor/test_pad_mm 1/1 ... [2025-12-04 11:18:48.842916][7607.740485011] 2025-12-04T11:18:48.8433523Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:18:48.8436285Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_pad_mm.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:18:48.843281] 2025-12-04T11:19:35.9546888Z 2025-12-04T11:19:35.9550643Z inductor/test_pad_mm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_pad_mm_1.1_8ac647038af06d96_.log 2025-12-04T11:19:35.9555492Z Running 19 items in this shard: test/inductor/test_pad_mm.py::PadMMTest::test_cat_pad_mm_dyn_m, test/inductor/test_pad_mm.py::PadMMTest::test_exclude_cat_padding, test/inductor/test_pad_mm.py::PadMMTest::test_exclude_padding, test/inductor/test_pad_mm.py::PadMMTest::test_no_autocast_in_pad_bmm_joint_graph_pass, test/inductor/test_pad_mm.py::PadMMTest::test_original_aten_preserved_pad_mm, test/inductor/test_pad_mm.py::PadMMTest::test_pad_addmm_2d_bias, test/inductor/test_pad_mm.py::PadMMTest::test_pad_addmm_dyn_m, test/inductor/test_pad_mm.py::PadMMTest::test_pad_addmm_dyn_mn, test/inductor/test_pad_mm.py::PadMMTest::test_pad_batch, test/inductor/test_pad_mm.py::PadMMTest::test_pad_bmm_dyn_b, test/inductor/test_pad_mm.py::PadMMTest::test_pad_bmm_dyn_bm, test/inductor/test_pad_mm.py::PadMMTest::test_pad_bmm_dyn_k, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_bf16, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_dyn_k, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_dyn_m, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_dyn_mnk, test/inductor/test_pad_mm.py::PadMMTest::test_pad_mm_dyn_n, test/inductor/test_pad_mm.py::PadMMTest::test_pad_single_cat, test/inductor/test_pad_mm.py::PadMMTest::test_zero_dim 2025-12-04T11:19:35.9559891Z 2025-12-04T11:19:35.9560154Z Finished inductor/test_pad_mm 1/1 ... [2025-12-04 11:19:35.954185][7654.851754089], took 0.79min 2025-12-04T11:19:35.9589478Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_pad_mm/inductor.test_pad_mm-1824b4b320322926.xml 2025-12-04T11:19:36.0289482Z Running test_sparse_semi_structured 1/1 ... [2025-12-04 11:19:36.028586][7654.926154737] 2025-12-04T11:19:36.0289967Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:19:36.0292417Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_semi_structured.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:19:36.028905] 2025-12-04T11:20:08.2998364Z 2025-12-04T11:20:08.2999299Z test_sparse_semi_structured 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_semi_structured_1.1_fe0272a596a3ae57_.log 2025-12-04T11:20:08.3109355Z Running 218 items in this shard: test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_mlp_contiguous_relu_compile_cusparselt, test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_mlp_contiguous_relu_compile_cutlass, test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_sp24_compile, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_indices_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_indices_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape0_inference_mode_False_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape0_inference_mode_False_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape0_inference_mode_True_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape0_inference_mode_True_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape1_inference_mode_False_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape1_inference_mode_False_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape1_inference_mode_True_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape1_inference_mode_True_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape2_inference_mode_False_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape2_inference_mode_False_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape2_inference_mode_True_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape2_inference_mode_True_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape3_inference_mode_False_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape3_inference_mode_False_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape3_inference_mode_True_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape3_inference_mode_True_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape0_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape0_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape1_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape1_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape2_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape2_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape3_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape3_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape0_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape0_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape0_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape1_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape2_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape2_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape2_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape0_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape0_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape0_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape1_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape2_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape2_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape2_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape0_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape0_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape0_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape1_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape2_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape2_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape2_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape0_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape0_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape0_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape1_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape2_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape2_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape2_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dim_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dim_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_complex128, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_complex64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_float32, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_float64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_int16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_int32, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_int64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_uint8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_complex128, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_complex64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_float32, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_float64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_int16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_int32, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_int64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_uint8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_values_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_values_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_all_patterns_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_all_patterns_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_all_patterns_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_linear_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_linear_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_linear_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_sparse_semi_structured_ops_cutlass_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_sparse_semi_structured_ops_cutlass_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_sparse_semi_structured_ops_cutlass_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_gemm_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_gemm_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_edge_case1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_edge_case1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_id_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_id_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_meta_correctness_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_meta_correctness_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_meta_correctness_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_meta_correctness_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_prune_dense_static_sort_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_prune_dense_static_sort_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pruning_algo_largest_abs_values_greedy_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pruning_algo_largest_abs_values_greedy_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pruning_algo_largest_abs_values_greedy_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pruning_algo_largest_abs_values_greedy_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_apply_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_apply_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_apply_dense_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_apply_dense_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_matmuls_bmm_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_matmuls_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_matmuls_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_matmuls_mat_vec_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_compile_autotune_bfloat16_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_compile_autotune_float16_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_compile_autotune_int32_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_mixed_dtype_bfloat16_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_mixed_dtype_float16_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_mixed_dtype_int32_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_mixed_dtype_bfloat16_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_mixed_dtype_float16_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_mixed_dtype_int32_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_search_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_search_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_search_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_csrc_cslt_sparse_mm_search_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_csrc_cslt_sparse_mm_search_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_csrc_cslt_sparse_mm_search_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cusparselt_backend_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_fp8fp8_mm_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_semi_structured_scaled_mm_bfloat16_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_semi_structured_scaled_mm_float16_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_semi_structured_scaled_mm_float32_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_semi_structured_scaled_mm_fp8_cuda 2025-12-04T11:20:08.3218790Z 2025-12-04T11:20:08.3219088Z Finished test_sparse_semi_structured 1/1 ... [2025-12-04 11:20:08.299909][7687.197477486], took 0.54min 2025-12-04T11:20:08.3220125Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_sparse_semi_structured/test_sparse_semi_structured-77ff603b5ea79d9b.xml 2025-12-04T11:20:08.4079544Z Running inductor/test_gpu_cpp_wrapper 1/1 ... [2025-12-04 11:20:08.407597][7687.305164273] 2025-12-04T11:20:08.4080061Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:20:08.4083004Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_gpu_cpp_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:20:08.407953] 2025-12-04T11:29:54.3205279Z 2025-12-04T11:29:54.3215518Z inductor/test_gpu_cpp_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_gpu_cpp_wrapper_1.1_a0299d1d2a0dc96c_.log 2025-12-04T11:29:54.3343660Z Running 295 items in this shard: test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_add_complex4_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_add_complex_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_adding_tensor_offsets_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_addmm_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_aoti_debug_printer_works_on_constants, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_as_strided_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_batch_norm_2d_2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bernoulli1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bitwise_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bmm1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bmm2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_buffer_use_after_remove_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_cat_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_cat_slice_cat_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_consecutive_split_cumprod_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_conv_backward_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_convolution1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_3_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_fusion_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dynamic_shapes_persistent_reduction_mixed_x_dim_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_embedding_bag_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_enable_dynamic_shapes_cpp_wrapper_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_fft_real_input_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_fft_real_input_real_output_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_foreach_cpp_wrapper_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_index_put_deterministic_fallback_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_index_tensor_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_inductor_layout_optimization_input_mutations_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_insignificant_strides_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_layer_norm_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear_relu_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_plus_mm2_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_plus_mm3_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_views_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_multi_device_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_multi_threading_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_non_tensor_args_wrapped_on_cpu, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pointwise_hermite_polynomial_h_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pointwise_hermite_polynomial_he_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pow3_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_profiler_mark_wrapper_call_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_randint_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_reduction1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_relu_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_repeat_interleave_2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_roi_align_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scalar_input_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scaled_dot_product_attention_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scaled_dot_product_efficient_attention_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_silu_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sort_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sum_dtype_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sum_int_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_transpose_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_add_complex4_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_add_complex_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_adding_tensor_offsets_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_addmm_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_annotation_training, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_as_strided_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_batch_norm_2d_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bernoulli1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bitwise_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bmm1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bmm2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_buffer_use_after_remove_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_cat_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_cat_slice_cat_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_consecutive_split_cumprod_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_conv_backward_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_convolution1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_3_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_fusion_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dynamic_shapes_persistent_reduction_mixed_x_dim_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_embedding_bag_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_enable_dynamic_shapes_cpp_wrapper_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_fft_real_input_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_fft_real_input_real_output_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_foreach_cpp_wrapper_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_index_put_deterministic_fallback_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_index_tensor_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_inductor_layout_optimization_input_mutations_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_insignificant_strides_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_layer_norm_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear_relu_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_plus_mm2_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_plus_mm3_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_views_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_multi_device_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_multi_threading_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pointwise_hermite_polynomial_h_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pointwise_hermite_polynomial_he_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pow3_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_profiler_mark_wrapper_call_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_randint_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_reduction1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_relu_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_repeat_interleave_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_roi_align_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scalar_input_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scaled_dot_product_attention_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scaled_dot_product_efficient_attention_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_silu_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sort_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sum_dtype_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sum_int_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_transpose_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_uint8_cuda_dynamic_shapes_gpu_wrapper 2025-12-04T11:29:54.3469939Z 2025-12-04T11:29:54.3470250Z Finished inductor/test_gpu_cpp_wrapper 1/1 ... [2025-12-04 11:29:54.320645][8273.218213873], took 9.77min 2025-12-04T11:29:54.3471329Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_gpu_cpp_wrapper/inductor.test_gpu_cpp_wrapper-b432c653bfe7053d.xml 2025-12-04T11:29:54.4199584Z Running inductor/test_move_constructors_to_gpu 1/1 ... [2025-12-04 11:29:54.419594][8273.317161542] 2025-12-04T11:29:54.4200108Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:29:54.4202906Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_move_constructors_to_gpu.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:29:54.419925] 2025-12-04T11:30:07.9118047Z 2025-12-04T11:30:07.9119008Z inductor/test_move_constructors_to_gpu 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_move_constructors_to_gpu_1.1_7d3bfc88dc865b70_.log 2025-12-04T11:30:07.9122724Z Running 7 items in this shard: test/inductor/test_move_constructors_to_gpu.py::TestMoveConstructorsToGpu::test_multi_gpu, test/inductor/test_move_constructors_to_gpu.py::TestMoveConstructorsToGpu::test_multiple_constructors, test/inductor/test_move_constructors_to_gpu.py::TestMoveConstructorsToGpu::test_no_gpu, test/inductor/test_move_constructors_to_gpu.py::TestMoveConstructorsToGpu::test_non_convertable_op_failure, test/inductor/test_move_constructors_to_gpu.py::TestMoveConstructorsToGpu::test_output_failure, test/inductor/test_move_constructors_to_gpu.py::TestMoveConstructorsToGpu::test_sets_equiv, test/inductor/test_move_constructors_to_gpu.py::TestMoveConstructorsToGpu::test_simple 2025-12-04T11:30:07.9125701Z 2025-12-04T11:30:07.9126049Z Finished inductor/test_move_constructors_to_gpu 1/1 ... [2025-12-04 11:30:07.911454][8286.809023405], took 0.22min 2025-12-04T11:30:07.9165643Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_move_constructors_to_gpu/inductor.test_move_constructors_to_gpu-350dc6b168b259f1.xml 2025-12-04T11:30:07.9954014Z Running dynamo/test_list 1/1 ... [2025-12-04 11:30:07.995096][8286.892664824] 2025-12-04T11:30:07.9954455Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:07.9958834Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_list.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:07.995425] 2025-12-04T11:30:14.5235008Z 2025-12-04T11:30:14.5236212Z dynamo/test_list 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_list_1.1_d6d47e0a20ceaf24_.log 2025-12-04T11:30:14.5246344Z Running 41 items in this shard: test/dynamo/test_list.py::TupleTests::test___contains__, test/dynamo/test_list.py::TupleTests::test___getitem__, test/dynamo/test_list.py::TupleTests::test___iter__, test/dynamo/test_list.py::TupleTests::test_binop_add, test/dynamo/test_list.py::TupleTests::test_binop_imul, test/dynamo/test_list.py::TupleTests::test_cmp_eq, test/dynamo/test_list.py::TupleTests::test_cmp_greater_than, test/dynamo/test_list.py::TupleTests::test_cmp_greater_than_or_equal, test/dynamo/test_list.py::TupleTests::test_cmp_less_than, test/dynamo/test_list.py::TupleTests::test_cmp_less_than_or_equal, test/dynamo/test_list.py::TupleTests::test_cmp_ne, test/dynamo/test_list.py::TupleTests::test_count, test/dynamo/test_list.py::TupleTests::test_index, test/dynamo/test_list.py::ListTests::test___contains__, test/dynamo/test_list.py::ListTests::test___delitem__, test/dynamo/test_list.py::ListTests::test___getitem__, test/dynamo/test_list.py::ListTests::test___iter__, test/dynamo/test_list.py::ListTests::test___setitem__, test/dynamo/test_list.py::ListTests::test_append, test/dynamo/test_list.py::ListTests::test_binop_add, test/dynamo/test_list.py::ListTests::test_binop_delitem_global_list, test/dynamo/test_list.py::ListTests::test_binop_iadd, test/dynamo/test_list.py::ListTests::test_binop_iadd_global_list, test/dynamo/test_list.py::ListTests::test_binop_imul, test/dynamo/test_list.py::ListTests::test_binop_imul_global_list, test/dynamo/test_list.py::ListTests::test_clear, test/dynamo/test_list.py::ListTests::test_cmp_eq, test/dynamo/test_list.py::ListTests::test_cmp_greater_than, test/dynamo/test_list.py::ListTests::test_cmp_greater_than_or_equal, test/dynamo/test_list.py::ListTests::test_cmp_less_than, test/dynamo/test_list.py::ListTests::test_cmp_less_than_or_equal, test/dynamo/test_list.py::ListTests::test_cmp_ne, test/dynamo/test_list.py::ListTests::test_copy, test/dynamo/test_list.py::ListTests::test_count, test/dynamo/test_list.py::ListTests::test_extend, test/dynamo/test_list.py::ListTests::test_index, test/dynamo/test_list.py::ListTests::test_insert, test/dynamo/test_list.py::ListTests::test_pop, test/dynamo/test_list.py::ListTests::test_remove, test/dynamo/test_list.py::ListTests::test_reverse, test/dynamo/test_list.py::ListTests::test_sort 2025-12-04T11:30:14.5254580Z 2025-12-04T11:30:14.5254832Z Finished dynamo/test_list 1/1 ... [2025-12-04 11:30:14.523178][8293.420746787], took 0.11min 2025-12-04T11:30:14.5287055Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_list/dynamo.test_list-89d839f900e86b8b.xml 2025-12-04T11:30:14.6118377Z Running dynamo/test_resume 1/1 ... [2025-12-04 11:30:14.611403][8293.508971156] 2025-12-04T11:30:14.6118990Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:14.6122154Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_resume.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:14.611745] 2025-12-04T11:30:18.9851228Z 2025-12-04T11:30:18.9852004Z dynamo/test_resume 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_resume_1.1_4b411187985d50e8_.log 2025-12-04T11:30:18.9852924Z Running 1 items in this shard: test/dynamo/test_resume.py::ResumeFunctionTests::test_freevars 2025-12-04T11:30:18.9853336Z 2025-12-04T11:30:18.9853596Z Finished dynamo/test_resume 1/1 ... [2025-12-04 11:30:18.984684][8297.882249436], took 0.07min 2025-12-04T11:30:18.9902508Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_resume/dynamo.test_resume-966072d129aef6be.xml 2025-12-04T11:30:19.0234915Z Running inductor/test_augmented_graph_helper 1/1 ... [2025-12-04 11:30:19.023135][8297.920704385] 2025-12-04T11:30:19.0235466Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:19.0239109Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_augmented_graph_helper.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:19.023454] 2025-12-04T11:30:22.7462224Z 2025-12-04T11:30:22.7463456Z inductor/test_augmented_graph_helper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_augmented_graph_helper_1.1_d973a35117a850a6_.log 2025-12-04T11:30:22.7472133Z Running 20 items in this shard: test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_cycle_through_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_cycle_with_extra_deps, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_extra_deps_with_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_has_path_direct, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_has_path_through_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_has_path_transitive, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_has_path_with_extra_deps, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_initial_state, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_merged_deps_collection, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_multiple_merge_unmerge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_no_cycle_in_dag, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_simple_cycle_detection, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_simple_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_transfer_multiple_merge_sets_with_chain, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_transfer_preserves_external_deps, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_transfer_with_cross_deps, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_transfer_with_merge_sets, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_transitive_merge, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_unmerge_from_singleton, test/inductor/test_augmented_graph_helper.py::TestAugmentedGraphHelper::test_unmerge_node 2025-12-04T11:30:22.7480219Z 2025-12-04T11:30:22.7480646Z Finished inductor/test_augmented_graph_helper 1/1 ... [2025-12-04 11:30:22.745941][8301.643507675], took 0.06min 2025-12-04T11:30:22.7516312Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_augmented_graph_helper/inductor.test_augmented_graph_helper-9629e19c1d2aae8f.xml 2025-12-04T11:30:22.7813010Z Running dynamo/test_deviceguard 1/1 ... [2025-12-04 11:30:22.780965][8301.678535309] 2025-12-04T11:30:22.7813496Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:22.7816478Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_deviceguard.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:22.781287] 2025-12-04T11:30:26.6534939Z 2025-12-04T11:30:26.6535924Z dynamo/test_deviceguard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_deviceguard_1.1_a09319d54f6cd115_.log 2025-12-04T11:30:26.6537849Z Running 4 items in this shard: test/dynamo/test_deviceguard.py::TestDeviceGuard::test_device_guard, test/dynamo/test_deviceguard.py::TestDeviceGuard::test_device_guard_no_index, test/dynamo/test_deviceguard.py::TestCUDADeviceGuard::test_device_guard, test/dynamo/test_deviceguard.py::TestCUDADeviceGuard::test_device_guard_no_index 2025-12-04T11:30:26.6539186Z 2025-12-04T11:30:26.6539472Z Finished dynamo/test_deviceguard 1/1 ... [2025-12-04 11:30:26.652986][8305.550554896], took 0.06min 2025-12-04T11:30:26.6588417Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_deviceguard/dynamo.test_deviceguard-44dacdbd0b626b15.xml 2025-12-04T11:30:26.7260441Z Running dynamo/test_sources 1/1 ... [2025-12-04 11:30:26.725711][8305.623280671] 2025-12-04T11:30:26.7260886Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:26.7264354Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_sources.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:26.726066] 2025-12-04T11:30:31.0998116Z 2025-12-04T11:30:31.0999459Z dynamo/test_sources 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_sources_1.1_ec074f68baeeb3a5_.log 2025-12-04T11:30:31.1001687Z Running 3 items in this shard: test/dynamo/test_sources.py::SourceTests::test_is_local, test/dynamo/test_sources.py::SourceTests::test_property_closure, test/dynamo/test_sources.py::SourceTests::test_supported_nodes 2025-12-04T11:30:31.1003000Z 2025-12-04T11:30:31.1003410Z Finished dynamo/test_sources 1/1 ... [2025-12-04 11:30:31.099458][8309.997026716], took 0.07min 2025-12-04T11:30:31.1057083Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_sources/dynamo.test_sources-c8e9ccd7fdd84d86.xml 2025-12-04T11:30:31.1344331Z Running dynamo/test_backward_higher_order_ops 1/1 ... [2025-12-04 11:30:31.134038][8310.031608297] 2025-12-04T11:30:31.1345112Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:31.1348102Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_backward_higher_order_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:31.134405] 2025-12-04T11:30:42.9209341Z 2025-12-04T11:30:42.9210459Z dynamo/test_backward_higher_order_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_backward_higher_order_ops_1.1_65963bb6de763503_.log 2025-12-04T11:30:42.9215051Z Running 7 items in this shard: test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_eager, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_pt2, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_pt2_compiled_autograd, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_pt2_compiled_autograd_graph_breaks, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_in_pt2_compiled_autograd_side_effect, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_make_bw, test/dynamo/test_backward_higher_order_ops.py::BackwardHigherOrderOpTests::test_invoke_make_fx_forward_contrived 2025-12-04T11:30:42.9218339Z 2025-12-04T11:30:42.9218669Z Finished dynamo/test_backward_higher_order_ops 1/1 ... [2025-12-04 11:30:42.920616][8321.818182947], took 0.20min 2025-12-04T11:30:42.9271668Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_backward_higher_order_ops/dynamo.test_backward_higher_order_ops-bea32613ba34e2b9.xml 2025-12-04T11:30:43.0384405Z Running dynamo/test_optimizers 1/1 ... [2025-12-04 11:30:43.038010][8321.93557774] 2025-12-04T11:30:43.0384886Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:43.0387460Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_optimizers.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:43.038352] 2025-12-04T11:30:48.6142267Z 2025-12-04T11:30:48.6143267Z dynamo/test_optimizers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_optimizers_1.1_6a53efce4d52876b_.log 2025-12-04T11:30:48.6145082Z Running 3 items in this shard: test/dynamo/test_optimizers.py::End2EndTests::test_init_group, test/dynamo/test_optimizers.py::End2EndTests::test_optimizing_over_tensor_with_requires_grad, test/dynamo/test_optimizers.py::End2EndTests::test_state_dict 2025-12-04T11:30:48.6146124Z 2025-12-04T11:30:48.6146406Z Finished dynamo/test_optimizers 1/1 ... [2025-12-04 11:30:48.613873][8327.511441272], took 0.09min 2025-12-04T11:30:48.6202881Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_optimizers/dynamo.test_optimizers-523ecb2595aa9e7c.xml 2025-12-04T11:30:48.6501500Z Running inductor/test_custom_partitioner_fn 1/1 ... [2025-12-04 11:30:48.649776][8327.547345606] 2025-12-04T11:30:48.6502152Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:48.6504752Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_custom_partitioner_fn.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:48.650099] 2025-12-04T11:30:59.1845865Z 2025-12-04T11:30:59.1847002Z inductor/test_custom_partitioner_fn 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_custom_partitioner_fn_1.1_76a9f83dca8c1ec8_.log 2025-12-04T11:30:59.1848244Z Running 1 items in this shard: test/inductor/test_custom_partitioner_fn.py::TestCustomPartitionerFn::test_custom_partitioner_fn 2025-12-04T11:30:59.1848785Z 2025-12-04T11:30:59.1849133Z Finished inductor/test_custom_partitioner_fn 1/1 ... [2025-12-04 11:30:59.184251][8338.081820173], took 0.18min 2025-12-04T11:30:59.1907100Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_custom_partitioner_fn/inductor.test_custom_partitioner_fn-096ee008790c4efc.xml 2025-12-04T11:30:59.2664162Z Running dynamo/test_debug_utils 1/1 ... [2025-12-04 11:30:59.266014][8338.163583594] 2025-12-04T11:30:59.2664760Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:30:59.2667436Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_debug_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:59.266380] 2025-12-04T11:31:03.6889422Z 2025-12-04T11:31:03.6891288Z dynamo/test_debug_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_debug_utils_1.1_6f3ba3f6dc151c68_.log 2025-12-04T11:31:03.6894419Z Running 4 items in this shard: test/dynamo/test_debug_utils.py::TestDebugUtilsCUDA::test_cast_model_to_fp64_dtype_args_cuda, test/dynamo/test_debug_utils.py::TestDebugUtilsCUDA::test_generate_env_vars_string_cuda, test/dynamo/test_debug_utils.py::TestDebugUtilsDeviceCUDA::test_aot_graph_parser_cuda, test/dynamo/test_debug_utils.py::TestDebugUtilsDeviceCUDA::test_sym_aot_graph_parser_cuda 2025-12-04T11:31:03.6895951Z 2025-12-04T11:31:03.6896247Z Finished dynamo/test_debug_utils 1/1 ... [2025-12-04 11:31:03.688562][8342.586128082], took 0.07min 2025-12-04T11:31:03.6957428Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_debug_utils/dynamo.test_debug_utils-7d0242e9dff3a28b.xml 2025-12-04T11:31:03.7283760Z Running dynamo/test_base_hop 1/1 ... [2025-12-04 11:31:03.727979][8342.625548827] 2025-12-04T11:31:03.7284356Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:31:03.7287027Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_base_hop.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:31:03.728292] 2025-12-04T11:31:09.3532803Z 2025-12-04T11:31:09.3533707Z dynamo/test_base_hop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_base_hop_1.1_047db26b5c066b2b_.log 2025-12-04T11:31:09.3537346Z Running 11 items in this shard: test/dynamo/test_base_hop.py::BaseHOPTest::test_aliasing_mutation_error, test/dynamo/test_base_hop.py::BaseHOPTest::test_aot_eager, test/dynamo/test_base_hop.py::BaseHOPTest::test_auto_functionalize, test/dynamo/test_base_hop.py::BaseHOPTest::test_dynamo, test/dynamo/test_base_hop.py::BaseHOPTest::test_eager_call, test/dynamo/test_base_hop.py::BaseHOPTest::test_int_input, test/dynamo/test_base_hop.py::BaseHOPTest::test_none_input, test/dynamo/test_base_hop.py::BaseHOPTest::test_schema_gen_pytree_in_out, test/dynamo/test_base_hop.py::BaseHOPTest::test_schema_gen_pytree_in_out_with_mutation, test/dynamo/test_base_hop.py::BaseHOPTest::test_schema_gen_single_return, test/dynamo/test_base_hop.py::BaseHOPTest::test_schema_gen_single_return_with_mutation 2025-12-04T11:31:09.3540220Z 2025-12-04T11:31:09.3540495Z Finished dynamo/test_base_hop 1/1 ... [2025-12-04 11:31:09.352968][8348.250536646], took 0.09min 2025-12-04T11:31:09.3596951Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_base_hop/dynamo.test_base_hop-39440f05cd7c387c.xml 2025-12-04T11:31:09.3902453Z Running dynamo/test_export 1/1 ... [2025-12-04 11:31:09.389915][8348.287485542] 2025-12-04T11:31:09.3902972Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:31:09.3906166Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_export.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:31:09.390219] 2025-12-04T11:31:38.5578338Z 2025-12-04T11:31:38.5579357Z dynamo/test_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_export_1.1_0d51c806ddeb10f0_.log 2025-12-04T11:31:38.5633121Z Running 186 items in this shard: test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_attr, test/dynamo/test_export.py::ExportTests::test_access_class_method_from_user_class_builtin, test/dynamo/test_export.py::ExportTests::test_byte_tensor_does_not_crash, test/dynamo/test_export.py::ExportTests::test_capture_symbolic_tracing_simple_within_fake_mode, test/dynamo/test_export.py::ExportTests::test_capture_symbolic_tracing_within_fake_mode, test/dynamo/test_export.py::ExportTests::test_cond_free_variables_overlapping, test/dynamo/test_export.py::ExportTests::test_cond_op_param_buffer_lifted, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_branch_args_mismatch, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_branch_return_multiple_tensors, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_branch_return_non_tensor, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_mismatch_return_length, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_mismatch_return_tensor_meta, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_missing_args, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_non_list_operands, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_non_tensor_operands, test/dynamo/test_export.py::ExportTests::test_cond_raise_user_error_on_unsupported_pred, test/dynamo/test_export.py::ExportTests::test_cond_supported_pred_types, test/dynamo/test_export.py::ExportTests::test_constraint_violation_error_messages, test/dynamo/test_export.py::ExportTests::test_dataclass_input_output, test/dynamo/test_export.py::ExportTests::test_dict_return, test/dynamo/test_export.py::ExportTests::test_dict_return_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes, test/dynamo/test_export.py::ExportTests::test_dupes_2, test/dynamo/test_export.py::ExportTests::test_dupes_2_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_non_tensor_arg, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_non_tensor_arg_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_non_tensor_output, test/dynamo/test_export.py::ExportTests::test_dupes_and_bypass_with_non_tensor_output_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dupes_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_dynamic_slicing, test/dynamo/test_export.py::ExportTests::test_dynamic_slicing_simple, test/dynamo/test_export.py::ExportTests::test_dynamo_enum_in_tuple, test/dynamo/test_export.py::ExportTests::test_dynamo_list_index, test/dynamo/test_export.py::ExportTests::test_empty, test/dynamo/test_export.py::ExportTests::test_enforce_equalities, test/dynamo/test_export.py::ExportTests::test_export, test/dynamo/test_export.py::ExportTests::test_export_compare_optimize_with_make_fx, test/dynamo/test_export.py::ExportTests::test_export_cond_in_aten_symbolic, test/dynamo/test_export.py::ExportTests::test_export_control_flow_with_getattr, test/dynamo/test_export.py::ExportTests::test_export_decomp, test/dynamo/test_export.py::ExportTests::test_export_decomp_asserts_bad_args, test/dynamo/test_export.py::ExportTests::test_export_defaults_ok, test/dynamo/test_export.py::ExportTests::test_export_dynamic_control_flow_error, test/dynamo/test_export.py::ExportTests::test_export_dynamic_dim_cleanup, test/dynamo/test_export.py::ExportTests::test_export_dynamic_dim_not_1, test/dynamo/test_export.py::ExportTests::test_export_dynamic_dim_range_constraint, test/dynamo/test_export.py::ExportTests::test_export_graph_bypass, test/dynamo/test_export.py::ExportTests::test_export_graph_bypass_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_graph_with_complex_reorder, test/dynamo/test_export.py::ExportTests::test_export_graph_with_complex_reorder_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_graph_with_list, test/dynamo/test_export.py::ExportTests::test_export_graph_with_list_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_identity, test/dynamo/test_export.py::ExportTests::test_export_masking_with_no_grad, test/dynamo/test_export.py::ExportTests::test_export_meta, test/dynamo/test_export.py::ExportTests::test_export_meta_val, test/dynamo/test_export.py::ExportTests::test_export_mismatched_out, test/dynamo/test_export.py::ExportTests::test_export_mismatched_out_2, test/dynamo/test_export.py::ExportTests::test_export_mismatched_out_2_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_mismatched_out_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_module_specify_constraints_signature, test/dynamo/test_export.py::ExportTests::test_export_multi_dynamic_dim_constraint, test/dynamo/test_export.py::ExportTests::test_export_multi_dynamic_dim_unsafe_relationship, test/dynamo/test_export.py::ExportTests::test_export_nn_module_stack_patched_module, test/dynamo/test_export.py::ExportTests::test_export_no_raise, test/dynamo/test_export.py::ExportTests::test_export_no_tensor_computation_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_pass_arg_by_name, test/dynamo/test_export.py::ExportTests::test_export_pass_arg_by_name_star_args, test/dynamo/test_export.py::ExportTests::test_export_persist_assert, test/dynamo/test_export.py::ExportTests::test_export_preserve_constraints_as_metadata_tensor, test/dynamo/test_export.py::ExportTests::test_export_preserves_nn_module_stack_for_get_attr, test/dynamo/test_export.py::ExportTests::test_export_raise_guard_full_constraint, test/dynamo/test_export.py::ExportTests::test_export_raise_guard_partial_constraint, test/dynamo/test_export.py::ExportTests::test_export_raise_on_relationship, test/dynamo/test_export.py::ExportTests::test_export_shape_control_flow_1, test/dynamo/test_export.py::ExportTests::test_export_specialized_int, test/dynamo/test_export.py::ExportTests::test_export_symbolic_shape, test/dynamo/test_export.py::ExportTests::test_export_with_args_and_empty_kwargs, test/dynamo/test_export.py::ExportTests::test_export_with_args_with_default_None, test/dynamo/test_export.py::ExportTests::test_export_with_args_with_default_float, test/dynamo/test_export.py::ExportTests::test_export_with_args_with_default_tensor, test/dynamo/test_export.py::ExportTests::test_export_with_args_with_default_tuple, test/dynamo/test_export.py::ExportTests::test_export_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_export_with_builtin_op_on_assume_constant, test/dynamo/test_export.py::ExportTests::test_export_with_cond_branches_calling_methods, test/dynamo/test_export.py::ExportTests::test_export_with_cond_closure, test/dynamo/test_export.py::ExportTests::test_export_with_cond_dynamic_shape_pred, test/dynamo/test_export.py::ExportTests::test_export_with_cond_with_closed_function, test/dynamo/test_export.py::ExportTests::test_export_with_constant_dict_values, test/dynamo/test_export.py::ExportTests::test_export_with_constant_free_function, test/dynamo/test_export.py::ExportTests::test_export_with_constant_free_function_and_class_method, test/dynamo/test_export.py::ExportTests::test_export_with_constant_free_function_and_class_method_multiarg, test/dynamo/test_export.py::ExportTests::test_export_with_constant_free_function_and_class_method_multiarg_diff, test/dynamo/test_export.py::ExportTests::test_export_with_constant_global_function, test/dynamo/test_export.py::ExportTests::test_export_with_constant_in_unspecialized_nn_module, test/dynamo/test_export.py::ExportTests::test_export_with_constant_list_nonzero, test/dynamo/test_export.py::ExportTests::test_export_with_constant_list_nonzero_free_function, test/dynamo/test_export.py::ExportTests::test_export_with_constant_method_on_module, test/dynamo/test_export.py::ExportTests::test_export_with_constant_method_on_module_invoke_twice, test/dynamo/test_export.py::ExportTests::test_export_with_constant_none_control_flow, test/dynamo/test_export.py::ExportTests::test_export_with_constant_none_control_flow_free_func, test/dynamo/test_export.py::ExportTests::test_export_with_constant_not_none_control_flow, test/dynamo/test_export.py::ExportTests::test_export_with_constant_not_none_control_flow_free_func, test/dynamo/test_export.py::ExportTests::test_export_with_constant_not_none_control_flow_pos, test/dynamo/test_export.py::ExportTests::test_export_with_constant_not_return_const, test/dynamo/test_export.py::ExportTests::test_export_with_constant_tuple_nonzero, test/dynamo/test_export.py::ExportTests::test_export_with_functools_wrapped_fn, test/dynamo/test_export.py::ExportTests::test_export_with_functools_wrapped_method, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_and_empty_args, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_with_default_None, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_with_default_float, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_with_default_tensor, test/dynamo/test_export.py::ExportTests::test_export_with_kwargs_with_default_tuple, test/dynamo/test_export.py::ExportTests::test_export_with_map_cond, test/dynamo/test_export.py::ExportTests::test_export_with_map_zero_sized_tensor, test/dynamo/test_export.py::ExportTests::test_export_with_map_zero_sized_tensor_suppress_errors, test/dynamo/test_export.py::ExportTests::test_export_with_module_layer, test/dynamo/test_export.py::ExportTests::test_export_with_nonzero_static, test/dynamo/test_export.py::ExportTests::test_export_with_shallow_list_copy_with_side_effects, test/dynamo/test_export.py::ExportTests::test_export_with_shallow_list_copy_wo_side_effects, test/dynamo/test_export.py::ExportTests::test_export_with_stack_trace, test/dynamo/test_export.py::ExportTests::test_export_with_symbool_inputs, test/dynamo/test_export.py::ExportTests::test_export_with_wrapped_fn, test/dynamo/test_export.py::ExportTests::test_exported_graph_serialization, test/dynamo/test_export.py::ExportTests::test_func_return, test/dynamo/test_export.py::ExportTests::test_func_return_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_fx_pytree, test/dynamo/test_export.py::ExportTests::test_immutable_list_dict, test/dynamo/test_export.py::ExportTests::test_input_container_type, test/dynamo/test_export.py::ExportTests::test_input_global, test/dynamo/test_export.py::ExportTests::test_input_global_multiple_access, test/dynamo/test_export.py::ExportTests::test_input_nonlocal, test/dynamo/test_export.py::ExportTests::test_input_unused_nonlocal_ok, test/dynamo/test_export.py::ExportTests::test_list_contains, test/dynamo/test_export.py::ExportTests::test_list_not_contains, test/dynamo/test_export.py::ExportTests::test_list_unpack, test/dynamo/test_export.py::ExportTests::test_list_unpack_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_map_cond_param_buffer_lifted, test/dynamo/test_export.py::ExportTests::test_mixed_real_and_fake_inputs, test/dynamo/test_export.py::ExportTests::test_multiple_outputs_op_with_evaluator, test/dynamo/test_export.py::ExportTests::test_nested_cond_op_param_buffer_lifted, test/dynamo/test_export.py::ExportTests::test_no_tensor_computation, test/dynamo/test_export.py::ExportTests::test_no_tensor_computation_2, test/dynamo/test_export.py::ExportTests::test_no_tensor_computation_2_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_no_tensor_computation_fail, test/dynamo/test_export.py::ExportTests::test_not_functionalize, test/dynamo/test_export.py::ExportTests::test_param_buffer_safe_from_mutation_recurse, test/dynamo/test_export.py::ExportTests::test_param_buffer_safe_from_mutation_simple, test/dynamo/test_export.py::ExportTests::test_pre_dispatch_simple, test/dynamo/test_export.py::ExportTests::test_predispatch_with_for_out_dtype, test/dynamo/test_export.py::ExportTests::test_predispatch_with_for_out_dtype_nested, test/dynamo/test_export.py::ExportTests::test_predispatch_with_higher_order, test/dynamo/test_export.py::ExportTests::test_predispatch_with_higher_order_nested, test/dynamo/test_export.py::ExportTests::test_preserve_fx_node_metadata, test/dynamo/test_export.py::ExportTests::test_preserve_fx_node_metadata_graph_break, test/dynamo/test_export.py::ExportTests::test_preserve_fx_node_metadata_inline, test/dynamo/test_export.py::ExportTests::test_preserve_fx_node_metadata_recompile, test/dynamo/test_export.py::ExportTests::test_remove_redundant_dynamic_dim_in_error_message, test/dynamo/test_export.py::ExportTests::test_retracibility, test/dynamo/test_export.py::ExportTests::test_retracibility_dict_container_inp_out, test/dynamo/test_export.py::ExportTests::test_retracibility_nested_list_out, test/dynamo/test_export.py::ExportTests::test_round_dynamic_shapes, test/dynamo/test_export.py::ExportTests::test_strict_fake_tensor_prop_real_tensors, test/dynamo/test_export.py::ExportTests::test_subclass_parameters, test/dynamo/test_export.py::ExportTests::test_sum_param, test/dynamo/test_export.py::ExportTests::test_sym_contains, test/dynamo/test_export.py::ExportTests::test_symbolic_tracing_within_fake_mode_with_constraints, test/dynamo/test_export.py::ExportTests::test_symbolic_tracing_within_fake_mode_with_constraints_with_parameters, test/dynamo/test_export.py::ExportTests::test_symbool, test/dynamo/test_export.py::ExportTests::test_torch_inference_mode_ctx, test/dynamo/test_export.py::ExportTests::test_trivial_constraint, test/dynamo/test_export.py::ExportTests::test_uncaptured_higher_order_op_error_not_suppresed, test/dynamo/test_export.py::ExportTests::test_untracked_inputs_in_constraints, test/dynamo/test_export.py::ExportTests::test_zeroes_in_and_out_different_shape_on_test, test/dynamo/test_export.py::ExportTests::test_zeroes_in_and_out_different_shape_on_test_with_aten_graph, test/dynamo/test_export.py::ExportTests::test_zeroes_in_new_shape_scalar_out, test/dynamo/test_export.py::ExportTests::test_zeroes_in_new_shape_scalar_out_permute, test/dynamo/test_export.py::ExportTests::test_zeroes_in_new_shape_scalar_out_permute_dupe_and_bypass, test/dynamo/test_export.py::ExportTestsDeviceCUDA::test_export_fast_binary_broadcast_check_cuda, test/dynamo/test_export.py::ExportTestsDeviceCUDA::test_export_fast_binary_broadcast_check_unbacked_cuda, test/dynamo/test_export.py::ExportTestsDeviceCUDA::test_export_with_parameters_cuda 2025-12-04T11:31:38.5693245Z 2025-12-04T11:31:38.5693535Z Finished dynamo/test_export 1/1 ... [2025-12-04 11:31:38.558108][8377.455676828], took 0.49min 2025-12-04T11:31:38.5694611Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_export/dynamo.test_export-8a85a11820ca2c3e.xml 2025-12-04T11:31:38.6443742Z Running dynamo/test_python_dispatcher 1/1 ... [2025-12-04 11:31:38.643976][8377.541544481] 2025-12-04T11:31:38.6444331Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:31:38.6446522Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_python_dispatcher.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:31:38.644300] 2025-12-04T11:31:43.5681383Z 2025-12-04T11:31:43.5682437Z dynamo/test_python_dispatcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_python_dispatcher_1.1_22f9013b571f1b8c_.log 2025-12-04T11:31:43.5685306Z Running 6 items in this shard: test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key1, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key2, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key3, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key4, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_dispatch_key_set_guard, test/dynamo/test_python_dispatcher.py::PythonDispatcherTests::test_functorch_interpreter 2025-12-04T11:31:43.5687468Z 2025-12-04T11:31:43.5687778Z Finished dynamo/test_python_dispatcher 1/1 ... [2025-12-04 11:31:43.567765][8382.465334558], took 0.08min 2025-12-04T11:31:43.5748654Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_python_dispatcher/dynamo.test_python_dispatcher-3db405fd4b8ecc70.xml 2025-12-04T11:31:43.6059277Z Running export/test_swap 1/1 ... [2025-12-04 11:31:43.605586][8382.503155766] 2025-12-04T11:31:43.6059813Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:31:43.6062577Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_swap.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:31:43.605906] 2025-12-04T11:31:50.2823696Z 2025-12-04T11:31:50.2824910Z export/test_swap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_swap_1.1_f1e06ac76a1c3f52_.log 2025-12-04T11:31:50.2831031Z Running 20 items in this shard: test/export/test_swap.py::TestSwap_nonstrict::test_custom_input_args, test/export/test_swap.py::TestSwap_nonstrict::test_custom_input_kwargs, test/export/test_swap.py::TestSwap_nonstrict::test_custom_input_kwargs_use_private, test/export/test_swap.py::TestSwap_nonstrict::test_custom_output, test/export/test_swap.py::TestSwap_nonstrict::test_dedup_sym_size, test/export/test_swap.py::TestSwap_nonstrict::test_nested_leaf, test/export/test_swap.py::TestSwap_nonstrict::test_remove_duplicate_pytree_different_order, test/export/test_swap.py::TestSwap_nonstrict::test_remove_duplicate_pytree_simple, test/export/test_swap.py::TestSwap_nonstrict::test_unflatten_preserve_signature, test/export/test_swap.py::TestSwap_nonstrict::test_unflatten_preserve_with_unused_input, test/export/test_swap.py::TestSwap_strict::test_custom_input_args, test/export/test_swap.py::TestSwap_strict::test_custom_input_kwargs, test/export/test_swap.py::TestSwap_strict::test_custom_input_kwargs_use_private, test/export/test_swap.py::TestSwap_strict::test_custom_output, test/export/test_swap.py::TestSwap_strict::test_dedup_sym_size, test/export/test_swap.py::TestSwap_strict::test_nested_leaf, test/export/test_swap.py::TestSwap_strict::test_remove_duplicate_pytree_different_order, test/export/test_swap.py::TestSwap_strict::test_remove_duplicate_pytree_simple, test/export/test_swap.py::TestSwap_strict::test_unflatten_preserve_signature, test/export/test_swap.py::TestSwap_strict::test_unflatten_preserve_with_unused_input 2025-12-04T11:31:50.2836998Z 2025-12-04T11:31:50.2837248Z Finished export/test_swap 1/1 ... [2025-12-04 11:31:50.282051][8389.179617667], took 0.11min 2025-12-04T11:31:50.2896088Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/export.test_swap/export.test_swap-7059f70fd3ff66c0.xml 2025-12-04T11:31:50.3644979Z Running export/test_unflatten 1/1 ... [2025-12-04 11:31:50.364122][8389.261689982] 2025-12-04T11:31:50.3645488Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:31:50.3648098Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_unflatten.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:31:50.364449] 2025-12-04T11:32:04.9563253Z 2025-12-04T11:32:04.9564218Z export/test_unflatten 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_unflatten_1.1_af789f1c183dcc4f_.log 2025-12-04T11:32:04.9573948Z Running 29 items in this shard: test/export/test_unflatten.py::TestUnflatten::test_assert_tensor_metadata_stack, test/export/test_unflatten.py::TestUnflatten::test_attr_as_submod_input, test/export/test_unflatten.py::TestUnflatten::test_dedup_sym_size, test/export/test_unflatten.py::TestUnflatten::test_double_nested_submodule, test/export/test_unflatten.py::TestUnflatten::test_duplicate_placeholder, test/export/test_unflatten.py::TestUnflatten::test_fx_trace, test/export/test_unflatten.py::TestUnflatten::test_nested_leaf_non_strict, test/export/test_unflatten.py::TestUnflatten::test_placeholder_and_get_attr_ordering_after_unflattened, test/export/test_unflatten.py::TestUnflatten::test_simple_alias, test/export/test_unflatten.py::TestUnflatten::test_unflatten_buffer_mutation, test/export/test_unflatten.py::TestUnflatten::test_unflatten_constant_obj, test/export/test_unflatten.py::TestUnflatten::test_unflatten_constant_tensor, test/export/test_unflatten.py::TestUnflatten::test_unflatten_container_type, test/export/test_unflatten.py::TestUnflatten::test_unflatten_eager, test/export/test_unflatten.py::TestUnflatten::test_unflatten_empty_branch, test/export/test_unflatten.py::TestUnflatten::test_unflatten_nested, test/export/test_unflatten.py::TestUnflatten::test_unflatten_nested_access, test/export/test_unflatten.py::TestUnflatten::test_unflatten_none, test/export/test_unflatten.py::TestUnflatten::test_unflatten_param_list_dict, test/export/test_unflatten.py::TestUnflatten::test_unflatten_preserve_signature, test/export/test_unflatten.py::TestUnflatten::test_unflatten_preserve_with_unused_input, test/export/test_unflatten.py::TestUnflatten::test_unflatten_requires_grad_param, test/export/test_unflatten.py::TestUnflatten::test_unflatten_root_module_type, test/export/test_unflatten.py::TestUnflatten::test_unflatten_shared_submodule, test/export/test_unflatten.py::TestUnflatten::test_unflatten_skipped_call_module, test/export/test_unflatten.py::TestUnflatten::test_unflatten_submodule_ordering, test/export/test_unflatten.py::TestUnflatten::test_unflatten_with_inplace_compile, test/export/test_unflatten.py::TestUnflatten::test_unflatten_wrong_input, test/export/test_unflatten.py::TestUnflatten::test_unflattened_module_nodes_has_meta_val 2025-12-04T11:32:04.9582595Z 2025-12-04T11:32:04.9582994Z Finished export/test_unflatten 1/1 ... [2025-12-04 11:32:04.955978][8403.853546639], took 0.24min 2025-12-04T11:32:04.9636646Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/export.test_unflatten/export.test_unflatten-2fe9b333652b1d69.xml 2025-12-04T11:32:05.0317933Z Running dynamo/test_verify_correctness 1/1 ... [2025-12-04 11:32:05.031359][8403.92892747] 2025-12-04T11:32:05.0318626Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:32:05.0321023Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_verify_correctness.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:32:05.031705] 2025-12-04T11:32:09.6549351Z 2025-12-04T11:32:09.6550904Z dynamo/test_verify_correctness 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_verify_correctness_1.1_80f11db651815ea1_.log 2025-12-04T11:32:09.6553110Z Running 4 items in this shard: test/dynamo/test_verify_correctness.py::TestVerifyCorrectness::test_example_inputs, test/dynamo/test_verify_correctness.py::TestVerifyCorrectness::test_incorrect_verify_false, test/dynamo/test_verify_correctness.py::TestVerifyCorrectness::test_incorrect_verify_true, test/dynamo/test_verify_correctness.py::TestVerifyCorrectness::test_torchscript 2025-12-04T11:32:09.6554709Z 2025-12-04T11:32:09.6555023Z Finished dynamo/test_verify_correctness 1/1 ... [2025-12-04 11:32:09.654587][8408.552156216], took 0.08min 2025-12-04T11:32:09.6625132Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_verify_correctness/dynamo.test_verify_correctness-0cd0444895d93ef1.xml 2025-12-04T11:32:09.6968900Z Running inductor/test_fxir_backend 1/1 ... [2025-12-04 11:32:09.696568][8408.59413706] 2025-12-04T11:32:09.6969401Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:32:09.6972812Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fxir_backend.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:32:09.696929] 2025-12-04T11:33:02.9671138Z 2025-12-04T11:33:02.9672612Z inductor/test_fxir_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fxir_backend_1.1_3198189fbbe435ae_.log 2025-12-04T11:33:02.9708737Z Running 73 items in this shard: test/inductor/test_fxir_backend.py::FxirTestCase::test_autotune_enable_tuning_False_use_dynamic_shapes_False, test/inductor/test_fxir_backend.py::FxirTestCase::test_autotune_enable_tuning_False_use_dynamic_shapes_True, test/inductor/test_fxir_backend.py::FxirTestCase::test_autotune_enable_tuning_True_use_dynamic_shapes_False, test/inductor/test_fxir_backend.py::FxirTestCase::test_autotune_enable_tuning_True_use_dynamic_shapes_True, test/inductor/test_fxir_backend.py::FxirTestCase::test_backward, test/inductor/test_fxir_backend.py::FxirTestCase::test_basic, test/inductor/test_fxir_backend.py::FxirTestCase::test_cat_inputs, test/inductor/test_fxir_backend.py::FxirTestCase::test_cat_reinterpret_view, test/inductor/test_fxir_backend.py::FxirTestCase::test_cat_to_alloc, test/inductor/test_fxir_backend.py::FxirTestCase::test_cat_views, test/inductor/test_fxir_backend.py::FxirTestCase::test_cond_no_operands_pred_False, test/inductor/test_fxir_backend.py::FxirTestCase::test_cond_no_operands_pred_True, test/inductor/test_fxir_backend.py::FxirTestCase::test_cond_subgraph_pred_False, test/inductor/test_fxir_backend.py::FxirTestCase::test_cond_subgraph_pred_True, test/inductor/test_fxir_backend.py::FxirTestCase::test_cpp_raises, test/inductor/test_fxir_backend.py::FxirTestCase::test_custom_compiler, test/inductor/test_fxir_backend.py::FxirTestCase::test_custom_triton, test/inductor/test_fxir_backend.py::FxirTestCase::test_debug, test/inductor/test_fxir_backend.py::FxirTestCase::test_device_type, test/inductor/test_fxir_backend.py::FxirTestCase::test_duplicate_input, test/inductor/test_fxir_backend.py::FxirTestCase::test_dynamic_launch_grid_calc, test/inductor/test_fxir_backend.py::FxirTestCase::test_dynamic_shapes_and_strides, test/inductor/test_fxir_backend.py::FxirTestCase::test_dynamic_shapes_precomputed_size, test/inductor/test_fxir_backend.py::FxirTestCase::test_dynamic_shapes_with_padding_shape0, test/inductor/test_fxir_backend.py::FxirTestCase::test_dynamic_shapes_with_padding_shape1, test/inductor/test_fxir_backend.py::FxirTestCase::test_dynamic_shapes_with_padding_shape2, test/inductor/test_fxir_backend.py::FxirTestCase::test_export_const_placeholder_const_1, test/inductor/test_fxir_backend.py::FxirTestCase::test_export_const_placeholder_const_1_5, test/inductor/test_fxir_backend.py::FxirTestCase::test_extern, test/inductor/test_fxir_backend.py::FxirTestCase::test_extern_multi_output, test/inductor/test_fxir_backend.py::FxirTestCase::test_fallback, test/inductor/test_fxir_backend.py::FxirTestCase::test_fallback_tuple_constant_arg, test/inductor/test_fxir_backend.py::FxirTestCase::test_free, test/inductor/test_fxir_backend.py::FxirTestCase::test_index_put_fallback, test/inductor/test_fxir_backend.py::FxirTestCase::test_multiple_kernels, test/inductor/test_fxir_backend.py::FxirTestCase::test_output_slice_view, test/inductor/test_fxir_backend.py::FxirTestCase::test_reshape_output, test/inductor/test_fxir_backend.py::FxirTestCase::test_scatter_fallback_scalar_src, test/inductor/test_fxir_backend.py::FxirTestCase::test_scatter_reduce_fallback, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_aoti_fx_add, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_aoti_fx_const, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_aoti_fx_dynamic, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_aoti_fx_linear, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_cond_dynamic_shape_pred_scalar_closure_length_4, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_cond_dynamic_shape_pred_scalar_closure_length_8, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_cond_multi_inputs_and_outputs_pred_False, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_cond_multi_inputs_and_outputs_pred_True, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_const_folded_subgraph, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_custom_backend, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_custom_triton_autotune_dynamic, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_dims_dynamic_outer_static_padded_inner, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_dynamic_input_expr_expr0, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_dynamic_input_expr_expr1, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_dynamic_scalar_output, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_item_dynamic_False_input__1_5, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_item_dynamic_False_input__2, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_item_dynamic_False_input__False, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_item_dynamic_True_input__1_5, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_item_dynamic_True_input__2, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_item_dynamic_True_input__False, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_mismatched_branch_dynamic_pred_False, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_mismatched_branch_dynamic_pred_True, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_reshape_dynamic_ph, test/inductor/test_fxir_backend.py::AOTFxirTestCase::test_reshape_dynamic_tmd, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_launch_grid_dynamic_padding, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_no_distribute_mul_floordiv, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_no_rewrite_div, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_rational_multi_pows, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_rewrite_floor_div_mul_pow, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_rewrite_floor_div_mul_rational, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_rewrite_floor_div_nested, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_rewrite_floor_div_rational_const, test/inductor/test_fxir_backend.py::TestReplaceFloorDiv::test_variable_exp 2025-12-04T11:33:02.9745289Z 2025-12-04T11:33:02.9745746Z Finished inductor/test_fxir_backend 1/1 ... [2025-12-04 11:33:02.966855][8461.864422529], took 0.89min 2025-12-04T11:33:02.9766194Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_fxir_backend/inductor.test_fxir_backend-3637a5683264f627.xml 2025-12-04T11:33:03.0631211Z Running dynamo/test_flat_apply 1/1 ... [2025-12-04 11:33:03.062655][8461.960222556] 2025-12-04T11:33:03.0632281Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:33:03.0633857Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_flat_apply.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:03.062997] 2025-12-04T11:33:07.5868080Z 2025-12-04T11:33:07.5869280Z dynamo/test_flat_apply 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_flat_apply_1.1_9a81e6ad28eceb5e_.log 2025-12-04T11:33:07.5872261Z Running 4 items in this shard: test/dynamo/test_flat_apply.py::FlatApplyTests::test_non_tensor_output, test/dynamo/test_flat_apply.py::FlatApplyTests::test_nonstrict_trace_captured_tensor_post_aot_graph, test/dynamo/test_flat_apply.py::FlatApplyTests::test_nonstrict_trace_dynamo_graph, test/dynamo/test_flat_apply.py::FlatApplyTests::test_simple 2025-12-04T11:33:07.5874363Z 2025-12-04T11:33:07.5874823Z Finished dynamo/test_flat_apply 1/1 ... [2025-12-04 11:33:07.586469][8466.484034666], took 0.08min 2025-12-04T11:33:07.5948084Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_flat_apply/dynamo.test_flat_apply-5b8baa80f2add320.xml 2025-12-04T11:33:07.6287275Z Running dynamo/test_input_attr_tracking 1/1 ... [2025-12-04 11:33:07.628257][8466.525827455] 2025-12-04T11:33:07.6288162Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:33:07.6289473Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_input_attr_tracking.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:07.628568] 2025-12-04T11:33:17.6637704Z 2025-12-04T11:33:17.6639290Z dynamo/test_input_attr_tracking 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_input_attr_tracking_1.1_b95b69d0a98c91c6_.log 2025-12-04T11:33:17.6648045Z Running 12 items in this shard: test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_complex_attr_access_with_graph_breaks, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_complex_attr_access_with_inline_reconstruct, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_complex_attr_access_without_graph_breaks, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_const_property_assigned_on_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_const_property_on_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_guards_correctly_property_assigned_on_tensor_type_change, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_guards_correctly_property_assigned_on_tensor_type_change_inductor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_set_data_on_input_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_set_data_on_scoped_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_set_data_on_user_defined_class_input_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_tensor_property_assigned_on_tensor, test/dynamo/test_input_attr_tracking.py::TestInputAttrTracking::test_tensor_property_on_tensor 2025-12-04T11:33:17.6656151Z 2025-12-04T11:33:17.6656625Z Finished dynamo/test_input_attr_tracking 1/1 ... [2025-12-04 11:33:17.663386][8476.560954568], took 0.17min 2025-12-04T11:33:17.6721125Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_input_attr_tracking/dynamo.test_input_attr_tracking-e7186c64c589dbc7.xml 2025-12-04T11:33:17.7417710Z Running dynamo/test_graph_deduplication 1/1 ... [2025-12-04 11:33:17.741358][8476.638926285] 2025-12-04T11:33:17.7418527Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:33:17.7421098Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_graph_deduplication.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:17.741693] 2025-12-04T11:33:29.4295302Z 2025-12-04T11:33:29.4296417Z dynamo/test_graph_deduplication 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_graph_deduplication_1.1_06acfa57009690f5_.log 2025-12-04T11:33:29.4304941Z Running 18 items in this shard: test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_autocast_ordering, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_arg_and_additional_deps, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_complex, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_no_cycle, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_simple, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_single_node, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_cycle_detection_two_node, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_dependent_subgraphs, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_input_aliasing, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_input_mutation, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_multiple_subgraphs, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_mutation_ordering, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_output_nodes_last, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_param_transfer_to_submodule, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_single_subgraph, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_single_subgraph2, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_tuple_inputs, test/dynamo/test_graph_deduplication.py::GraphDededuplicationTests::test_tuple_return 2025-12-04T11:33:29.4312028Z 2025-12-04T11:33:29.4312348Z Finished dynamo/test_graph_deduplication 1/1 ... [2025-12-04 11:33:29.429178][8488.326747265], took 0.19min 2025-12-04T11:33:29.4374505Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_graph_deduplication/dynamo.test_graph_deduplication-146b0afa707c650e.xml 2025-12-04T11:33:29.5407921Z Running inductor/test_distributed_patterns 1/1 ... [2025-12-04 11:33:29.540402][8488.437970136] 2025-12-04T11:33:29.5408632Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:33:29.5412292Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_distributed_patterns.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:29.540726] 2025-12-04T11:33:56.3069434Z 2025-12-04T11:33:56.3070752Z inductor/test_distributed_patterns 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_distributed_patterns_1.1_7e5bf6d90d3b740d_.log 2025-12-04T11:33:56.3080504Z Running 20 items in this shard: test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_fake_distributed_aot_eager, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_fake_distributed_inductor, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_intermediate_hook_with_closure, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_intermediate_hook_with_nested_closure, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_aot, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_eager, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_inductor, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_multi_layers, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return2, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return3, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return4, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_nonzero_cpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_nonzero_gpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_zero_cpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_zero_gpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_preserve_version_counter1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_preserve_version_counter2, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_set_version_counter1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_set_version_counter2 2025-12-04T11:33:56.3088979Z 2025-12-04T11:33:56.3089319Z Finished inductor/test_distributed_patterns 1/1 ... [2025-12-04 11:33:56.306644][8515.204212857], took 0.45min 2025-12-04T11:33:56.3151986Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_distributed_patterns/inductor.test_distributed_patterns-a225771515eff859.xml 2025-12-04T11:33:56.3946736Z Running dynamo/test_structured_trace 1/1 ... [2025-12-04 11:33:56.394300][8515.291868446] 2025-12-04T11:33:56.3947422Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:33:56.3950719Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_structured_trace.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:56.394654] 2025-12-04T11:34:31.6970256Z 2025-12-04T11:34:31.6971496Z dynamo/test_structured_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_structured_trace_1.1_a4f4f804030b4054_.log 2025-12-04T11:34:31.6982586Z Running 29 items in this shard: test/dynamo/test_structured_trace.py::StructuredTraceTest::test_chromium_event, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_codecache, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_collective_schedule_empty, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_collective_schedule_real, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_compile_id_serialization_deserialization, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_compiled_autograd_attribution, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_compiled_autograd_chromium, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_compiled_autograd_id, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_cudagraphs, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_ddp_graphs, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_dump_file, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_dynamo_error, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_example_fn, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_example_training_fn, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_graph_breaks, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_graph_execution_order, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_graph_sizes_dynamic, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_guards_recompiles, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_inductor_error, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_make_fx_fail_partial, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_recompile_user_contexts, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_recompile_user_contexts_iteration, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_recompiles, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_runtime_estimates_mixed, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_runtime_estimates_simple, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_schedule, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_tensor_metadata_logging, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_tensor_metadata_logging_dynamic_shapes, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_tensor_metadata_logging_multiple_ops 2025-12-04T11:34:31.6993120Z 2025-12-04T11:34:31.6993428Z Finished dynamo/test_structured_trace 1/1 ... [2025-12-04 11:34:31.696794][8550.59436276], took 0.59min 2025-12-04T11:34:31.7056084Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_structured_trace/dynamo.test_structured_trace-36a47fd4af733696.xml 2025-12-04T11:34:31.7792152Z Running dynamo/test_fake_distributed 1/1 ... [2025-12-04 11:34:31.778786][8550.676354062] 2025-12-04T11:34:31.7792851Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:34:31.7796416Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fake_distributed.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:34:31.779130] 2025-12-04T11:34:37.1537943Z 2025-12-04T11:34:37.1538994Z dynamo/test_fake_distributed 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fake_distributed_1.1_e5311a05636d963a_.log 2025-12-04T11:34:37.1540839Z Running 3 items in this shard: test/dynamo/test_fake_distributed.py::TestFakeDistributed::test_all_to_all_single_autograd, test/dynamo/test_fake_distributed.py::TestFakeDistributed::test_device_mesh_flatten, test/dynamo/test_fake_distributed.py::TestFakeDistributed::test_device_mesh_get_local_rank 2025-12-04T11:34:37.1542040Z 2025-12-04T11:34:37.1542354Z Finished dynamo/test_fake_distributed 1/1 ... [2025-12-04 11:34:37.153476][8556.051045621], took 0.09min 2025-12-04T11:34:37.1621943Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_fake_distributed/dynamo.test_fake_distributed-21ff8fa13be61f59.xml 2025-12-04T11:34:37.2006547Z Running export/test_nativert 1/1 ... [2025-12-04 11:34:37.200171][8556.097741096] 2025-12-04T11:34:37.2007165Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:34:37.2008693Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_nativert.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:34:37.200489] 2025-12-04T11:34:44.0286981Z 2025-12-04T11:34:44.0287838Z export/test_nativert 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_nativert_1.1_d6bd2abd6f18e6a8_.log 2025-12-04T11:34:44.0290177Z Running 6 items in this shard: test/export/test_nativert.py::TestNativeRT::test_aoti_0_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_1_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_2_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_3_cuda, test/export/test_nativert.py::TestNativeRT::test_aoti_4_cuda, test/export/test_nativert.py::TestNativeRT::test_aoti_5_cuda 2025-12-04T11:34:44.0291661Z 2025-12-04T11:34:44.0291933Z Finished export/test_nativert 1/1 ... [2025-12-04 11:34:44.028331][8562.925900729], took 0.11min 2025-12-04T11:34:44.0375358Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/export.test_nativert/export.test_nativert-11993ef1b36ec884.xml 2025-12-04T11:34:44.1199865Z Running export/test_hop 1/1 ... [2025-12-04 11:34:44.119565][8563.017133534] 2025-12-04T11:34:44.1200495Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:34:44.1203363Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_hop.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:34:44.119933] 2025-12-04T11:35:01.1646122Z 2025-12-04T11:35:01.1647578Z export/test_hop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_hop_1.1_a564c4d3545a0718_.log 2025-12-04T11:35:01.1665335Z Running 44 items in this shard: test/export/test_hop.py::TestHOPCUDA::test_aot_export_auto_functionalize_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_cond_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_flex_attention_backward_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_flex_attention_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_invoke_quant_packed_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_invoke_quant_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_invoke_subgraph_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_local_map_hop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_scan_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_while_loop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_aot_export_while_loop_stack_output_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_auto_functionalize_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_cond_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_flex_attention_backward_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_flex_attention_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_invoke_quant_packed_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_invoke_quant_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_invoke_subgraph_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_local_map_hop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_scan_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_while_loop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_pre_dispatch_export_while_loop_stack_output_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_auto_functionalize_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_cond_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_flex_attention_backward_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_flex_attention_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_invoke_quant_packed_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_invoke_quant_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_invoke_subgraph_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_local_map_hop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_scan_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_while_loop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_retrace_export_while_loop_stack_output_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_auto_functionalize_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_cond_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_flex_attention_backward_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_flex_attention_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_invoke_quant_packed_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_invoke_quant_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_invoke_subgraph_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_local_map_hop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_scan_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_while_loop_simple_cuda_float32, test/export/test_hop.py::TestHOPCUDA::test_serialize_export_while_loop_stack_output_simple_cuda_float32 2025-12-04T11:35:01.1680390Z 2025-12-04T11:35:01.1680656Z Finished export/test_hop 1/1 ... [2025-12-04 11:35:01.164623][8580.062191624], took 0.28min 2025-12-04T11:35:01.1737275Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/export.test_hop/export.test_hop-d9a65f9d494897c7.xml 2025-12-04T11:35:01.2596444Z Running test_model_exports_to_core_aten 1/1 ... [2025-12-04 11:35:01.259313][8580.156881434] 2025-12-04T11:35:01.2596944Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:35:01.2600045Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_model_exports_to_core_aten.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:35:01.259635] 2025-12-04T11:35:05.2820951Z 2025-12-04T11:35:05.2822165Z test_model_exports_to_core_aten 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_model_exports_to_core_aten_1.1_6a0e15a24c6821d5_.log 2025-12-04T11:35:05.2824195Z Running 1 items in this shard: test/test_model_exports_to_core_aten.py::TestQuantizePT2EModels::test_vit_aten_export 2025-12-04T11:35:05.2825105Z 2025-12-04T11:35:05.2825562Z Finished test_model_exports_to_core_aten 1/1 ... [2025-12-04 11:35:05.281801][8584.179370045], took 0.07min 2025-12-04T11:35:05.2913188Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_model_exports_to_core_aten/test_model_exports_to_core_aten-46fb204eaca6b9f2.xml 2025-12-04T11:35:05.3230478Z Running dynamo/test_precompile_context 1/1 ... [2025-12-04 11:35:05.322653][8584.220222278] 2025-12-04T11:35:05.3230983Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:35:05.3233617Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_precompile_context.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:35:05.322990] 2025-12-04T11:35:15.8583294Z 2025-12-04T11:35:15.8584273Z dynamo/test_precompile_context 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_precompile_context_1.1_f035b5fe64050d47_.log 2025-12-04T11:35:15.8586024Z Running 3 items in this shard: test/dynamo/test_precompile_context.py::PrecompileContextTests::test_basic, test/dynamo/test_precompile_context.py::PrecompileContextTests::test_editable, test/dynamo/test_precompile_context.py::PrecompileContextTests::test_serialize_by_key 2025-12-04T11:35:15.8587508Z 2025-12-04T11:35:15.8587827Z Finished dynamo/test_precompile_context 1/1 ... [2025-12-04 11:35:15.857916][8594.755483764], took 0.18min 2025-12-04T11:35:15.8675125Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_precompile_context/dynamo.test_precompile_context-7bf3bf27c616b5c5.xml 2025-12-04T11:35:15.9428791Z Running dynamo/test_trace_rules 1/1 ... [2025-12-04 11:35:15.942454][8594.840002776] 2025-12-04T11:35:15.9429401Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:35:15.9431879Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_trace_rules.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:35:15.942798] 2025-12-04T11:35:21.5183570Z 2025-12-04T11:35:21.5184455Z dynamo/test_trace_rules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_trace_rules_1.1_c52604f69a6cd1ab_.log 2025-12-04T11:35:21.5187398Z Running 7 items in this shard: test/dynamo/test_trace_rules.py::TraceRuleTests::test_almost_impossible_missing_name, test/dynamo/test_trace_rules.py::TraceRuleTests::test_force_inline_custom_function, test/dynamo/test_trace_rules.py::TraceRuleTests::test_force_inline_torch_function, test/dynamo/test_trace_rules.py::TraceRuleTests::test_no_special_handlers_for_torch_non_c_bindings, test/dynamo/test_trace_rules.py::TraceRuleTests::test_skipfiles_inlinelist, test/dynamo/test_trace_rules.py::TraceRuleTests::test_torch_name_rule_map_updated, test/dynamo/test_trace_rules.py::TestModuleSurviveSkipFiles::test_module_survive_skip_files 2025-12-04T11:35:21.5189779Z 2025-12-04T11:35:21.5190068Z Finished dynamo/test_trace_rules 1/1 ... [2025-12-04 11:35:21.517775][8600.415341688], took 0.09min 2025-12-04T11:35:21.5276548Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_trace_rules/dynamo.test_trace_rules-ee853126f21c3a7b.xml 2025-12-04T11:35:21.5549106Z Running export/test_upgrader 1/1 ... [2025-12-04 11:35:21.554563][8600.45213236] 2025-12-04T11:35:21.5549570Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:35:21.5552657Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_upgrader.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:35:21.554887] 2025-12-04T11:35:25.2768211Z 2025-12-04T11:35:25.2769278Z export/test_upgrader 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_upgrader_1.1_cd3c4ec05ae92aaa_.log 2025-12-04T11:35:25.2771953Z Running 6 items in this shard: test/export/test_upgrader.py::TestUpgrader::test_field_renaming_chain_from_v0_complete, test/export/test_upgrader.py::TestUpgrader::test_field_renaming_chain_from_v0_missing_field, test/export/test_upgrader.py::TestUpgrader::test_field_renaming_from_v1_partial_chain, test/export/test_upgrader.py::TestUpgrader::test_nn_module_stack_error_handling_invalid_type, test/export/test_upgrader.py::TestUpgrader::test_nn_module_stack_transformation_from_v0, test/export/test_upgrader.py::TestUpgrader::test_nodes_without_metadata_handled_gracefully 2025-12-04T11:35:25.2774102Z 2025-12-04T11:35:25.2774383Z Finished export/test_upgrader 1/1 ... [2025-12-04 11:35:25.276476][8604.174044585], took 0.06min 2025-12-04T11:35:25.2863636Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/export.test_upgrader/export.test_upgrader-f8a313da87b3ac32.xml 2025-12-04T11:35:25.3186227Z Running dynamo/test_hooks 1/1 ... [2025-12-04 11:35:25.318237][8604.215806741] 2025-12-04T11:35:25.3186815Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:35:25.3189398Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_hooks.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:35:25.318586] 2025-12-04T11:35:45.1712198Z 2025-12-04T11:35:45.1713231Z dynamo/test_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_hooks_1.1_38c972f0c8749aa6_.log 2025-12-04T11:35:45.1724815Z Running 34 items in this shard: test/dynamo/test_hooks.py::HooksTests::test_complex_state_mutation_in_intermediary_hooks_same_on_inductor, test/dynamo/test_hooks.py::HooksTests::test_complex_state_mutation_in_intermediary_hooks_same_on_inductor_with_graph_break, test/dynamo/test_hooks.py::HooksTests::test_functools_arg_vary, test/dynamo/test_hooks.py::HooksTests::test_global_module_forward_pre_hook, test/dynamo/test_hooks.py::HooksTests::test_hook_with_closure, test/dynamo/test_hooks.py::HooksTests::test_hook_with_nested_closure, test/dynamo/test_hooks.py::HooksTests::test_input_hooks_same, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks_same_on_aot_eager, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks_same_on_inductor, test/dynamo/test_hooks.py::HooksTests::test_intermediate_hook_with_closure_aot, test/dynamo/test_hooks.py::HooksTests::test_intermediate_hook_with_closure_eager, test/dynamo/test_hooks.py::HooksTests::test_nnmodule_hook_guards, test/dynamo/test_hooks.py::HooksTests::test_no_recompile_on_hook_identity_change, test/dynamo/test_hooks.py::HooksTests::test_no_recompile_on_same_hook, test/dynamo/test_hooks.py::HooksTests::test_post_acc_grad_hook, test/dynamo/test_hooks.py::HooksTests::test_recompile, test/dynamo/test_hooks.py::HooksTests::test_register_hook_partial_guarding, test/dynamo/test_hooks.py::HooksTests::test_removed_handle_return, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_local_inner, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_global_hook, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_global_hooks_handles_in_list, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_break_handle_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_break_handle_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_multi_handle_return, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_repeated_handle_not_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_repeated_handle_return, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_multiple_hooks, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_multiple_hooks_handles_in_list, test/dynamo/test_hooks.py::HooksTests::test_wrap_top_frame_with_hooks 2025-12-04T11:35:45.1734859Z 2025-12-04T11:35:45.1735124Z Finished dynamo/test_hooks 1/1 ... [2025-12-04 11:35:45.170856][8624.068424816], took 0.33min 2025-12-04T11:35:45.1810644Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/dynamo.test_hooks/dynamo.test_hooks-b07cb839d1e57fc7.xml 2025-12-04T11:35:45.2643054Z Running export/test_sparse 1/1 ... [2025-12-04 11:35:45.263891][8624.161458864] 2025-12-04T11:35:45.2643621Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:35:45.2646255Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_sparse.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:35:45.264245] 2025-12-04T11:46:08.4795984Z 2025-12-04T11:46:08.4796925Z export/test_sparse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_sparse_1.1_c3722ceed82c1d25_.log 2025-12-04T11:46:08.4860484Z Running 203 items in this shard: test/export/test_sparse.py::TestSparseProp::test_activation_coo, test/export/test_sparse.py::TestSparseProp::test_activation_csr, test/export/test_sparse.py::TestSparseProp::test_add, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCSR 2025-12-04T11:46:08.4922784Z 2025-12-04T11:46:08.4923070Z Finished export/test_sparse 1/1 ... [2025-12-04 11:46:08.479543][9247.377110561], took 10.39min 2025-12-04T11:46:08.4924086Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/export.test_sparse/export.test_sparse-4d7ad8fd2ec6dc81.xml 2025-12-04T11:46:09.5120247Z Uploading artifacts took 0.93 seconds 2025-12-04T11:46:09.5124467Z Running test_modules 1/4 ... [2025-12-04 11:46:09.512131][9248.409697296] 2025-12-04T11:46:09.5124909Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:46:09.5129498Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '--shard-id=1', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:46:09.512586] 2025-12-04T11:56:05.3642574Z 2025-12-04T11:56:05.3643470Z test_modules 1/4 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_1.4_bac8f06793ddacc2_.log 2025-12-04T11:56:05.3941980Z Running 911 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_forward_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LSTM_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerEncoder_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AvgPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm3d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GELU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GRUCell_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardshrink_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardshrink_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardtanh_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm2d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm3d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_KLDivLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTMCell_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTM_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTM_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Linear_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LocalResponseNorm_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSigmoid_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSoftmax_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSoftmax_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MSELoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MarginRankingLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiLabelMarginLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiMarginLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiheadAttention_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RMSNorm_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RNN_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SiLU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Sigmoid_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Sigmoid_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SmoothL1Loss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SmoothL1Loss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SoftMarginLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softmax2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softsign_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softsign_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Tanh_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Tanhshrink_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Threshold_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoderLayer_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCELoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCEWithLogitsLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm3d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Bilinear_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CTCLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CTCLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CosineEmbeddingLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Embedding_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GELU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GaussianNLLLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GroupNorm_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardswish_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardswish_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardtanh_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_L1Loss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Linear_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LocalResponseNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSoftmax_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSoftmax_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MSELoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MarginRankingLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_NLLLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PReLU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PReLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RMSNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNNCell_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SELU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Sigmoid_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Sigmoid_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SoftMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmin_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmin_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softplus_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanhshrink_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerDecoderLayer_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerDecoderLayer_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Transformer_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Transformer_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad3d_swap_False_set_grad_True_cuda_float32 2025-12-04T11:56:05.4234675Z 2025-12-04T11:56:05.4234920Z Finished test_modules 1/4 ... [2025-12-04 11:56:05.365359][9844.262927004], took 9.93min 2025-12-04T11:56:05.4235785Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_modules/test_modules-8463597ee5493803.xml 2025-12-04T11:56:05.4682577Z Running complex_tensor/test_complex_tensor 1/1 ... [2025-12-04 11:56:05.467795][9844.365362291] 2025-12-04T11:56:05.4683087Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T11:56:05.4686659Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'complex_tensor/test_complex_tensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:56:05.468152] 2025-12-04T12:04:10.0568628Z 2025-12-04T12:04:10.0569795Z complex_tensor/test_complex_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/complex_tensor.test_complex_tensor_1.1_dd9d0d0b4ce48ad0_.log 2025-12-04T12:04:10.0827324Z Running 578 items in this shard: test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_abs_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_abs_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_abs_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_acos_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_acos_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_acos_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_acosh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_acosh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_acosh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_add_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_add_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_addmm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_addmm_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_addmm_decomposed_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_addmm_decomposed_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_all_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_all_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_allclose_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_allclose_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_angle_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_angle_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_angle_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_any_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_any_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_asin_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_asin_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_asin_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_asinh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_asinh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_asinh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_atan_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_atan_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_atan_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_atanh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_atanh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_atanh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_bmm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_bmm_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cat_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cat_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cat_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_clone_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_clone_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_clone_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_conj_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_conj_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_conj_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_conj_physical_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_conj_physical_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_conj_physical_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_constant_pad_nd_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_constant_pad_nd_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cos_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cos_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cos_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cosh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cosh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cosh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cumprod_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cumprod_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cumsum_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_cumsum_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_diagonal_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_diagonal_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_diagonal_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_diagonal_scatter_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_diagonal_scatter_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_div_no_rounding_mode_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_div_no_rounding_mode_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_div_no_rounding_mode_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_dot_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_dot_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_empty_like_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_empty_like_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_empty_like_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_eq_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_eq_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_eq_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_exp_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_exp_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_exp_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_expand_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_expand_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_expm1_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_expm1_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_flatten_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_flatten_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_flatten_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_flip_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_flip_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_full_like_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_full_like_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_gather_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_gather_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_imag_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_imag_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_imag_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_index_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_index_add_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_index_add_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_index_select_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_index_select_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_index_select_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isclose_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isclose_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isfinite_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isfinite_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isfinite_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isinf_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isinf_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isinf_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isnan_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_isnan_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_log1p_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_log1p_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_log_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_log_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_log_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_logical_and_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_logical_and_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_logical_not_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_logical_not_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_logical_or_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_logical_or_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_logical_xor_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_logical_xor_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_masked_fill_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_masked_fill_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_masked_fill_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_masked_scatter_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_masked_scatter_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_mean_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_mean_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_mm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_mm_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_mul_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_mul_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_mul_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_ne_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_ne_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_neg_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_neg_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_neg_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_new_zeros_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_new_zeros_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_new_zeros_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_permute_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_permute_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_permute_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_pow_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_pow_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_pow_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_prod_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_prod_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_prod_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_randn_like_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_randn_like_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_randn_like_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_real_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_real_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_real_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_reciprocal_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_reciprocal_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_repeat_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_repeat_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_rsqrt_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_rsqrt_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_rsqrt_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_rsub_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_rsub_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_scatter_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_scatter_add_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_select_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_select_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_select_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sgn_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sgn_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sgn_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sin_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sin_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sin_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sinh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sinh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sinh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_slice_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_slice_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_slice_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_split_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_split_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_split_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_split_list_args_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_split_list_args_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_split_with_sizes_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_split_with_sizes_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_split_with_sizes_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sqrt_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sqrt_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sqrt_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_squeeze_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_squeeze_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_squeeze_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_squeeze_multiple_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_squeeze_multiple_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_squeeze_multiple_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_stack_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_stack_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_stack_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sub_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sub_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sub_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sum_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sum_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_sum_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_t_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_t_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_tan_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_tan_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_tan_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_tanh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_tanh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_tanh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_to_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_to_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_transpose_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_transpose_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_transpose_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_true_divide_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_true_divide_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_true_divide_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_unsqueeze_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_unsqueeze_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_unsqueeze_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_var_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_var_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_var_unbiased_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_var_unbiased_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_view_as_real_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_view_as_real_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_view_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_view_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_view_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_where_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_where_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_where_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_zero__cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_zero__cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_zeros_like_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_zeros_like_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexTensorCUDA::test_consistency_zeros_like_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_abs_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_acos_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_acosh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_addmm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_addmm_decomposed_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_angle_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_asin_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_asinh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_atan_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_atanh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_bmm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_cat_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_clone_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_conj_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_conj_physical_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_constant_pad_nd_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_cos_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_cosh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_cumprod_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_cumsum_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_diagonal_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_diagonal_scatter_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_div_no_rounding_mode_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_dot_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_exp_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_expand_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_expm1_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_flatten_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_flip_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_gather_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_imag_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_index_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_index_select_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_log1p_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_log_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_masked_fill_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_masked_scatter_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_mean_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_mm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_mul_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_neg_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_permute_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_pow_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_prod_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_real_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_reciprocal_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_repeat_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_rsqrt_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_rsub_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_scatter_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_select_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_sgn_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_sin_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_sinh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_slice_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_split_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_split_list_args_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_split_with_sizes_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_sqrt_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_squeeze_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_squeeze_multiple_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_stack_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_sub_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_sum_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_t_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_tan_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_tanh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_to_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_transpose_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_true_divide_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_unsqueeze_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_var_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_var_unbiased_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_view_as_real_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_view_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_where_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexBwdGradientsCUDA::test_fn_grad_zero__cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_abs_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_abs_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_abs_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_acos_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_acos_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_acos_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_acosh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_acosh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_acosh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_add_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_add_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_addmm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_addmm_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_addmm_decomposed_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_addmm_decomposed_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_all_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_all_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_allclose_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_allclose_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_angle_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_angle_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_angle_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_any_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_any_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_asin_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_asin_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_asin_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_asinh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_asinh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_asinh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_atan_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_atan_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_atan_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_atanh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_atanh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_atanh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_bmm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_bmm_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cat_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cat_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cat_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_clone_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_clone_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_clone_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_conj_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_conj_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_conj_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_conj_physical_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_conj_physical_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_conj_physical_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_constant_pad_nd_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_constant_pad_nd_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cos_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cos_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cos_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cosh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cosh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cosh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cumprod_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cumprod_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cumsum_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_cumsum_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_diagonal_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_diagonal_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_diagonal_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_diagonal_scatter_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_diagonal_scatter_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_div_no_rounding_mode_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_div_no_rounding_mode_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_div_no_rounding_mode_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_dot_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_dot_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_empty_like_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_empty_like_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_empty_like_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_eq_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_eq_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_eq_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_exp_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_exp_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_exp_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_expand_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_expand_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_expm1_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_expm1_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_flatten_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_flatten_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_flatten_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_flip_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_flip_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_full_like_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_full_like_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_gather_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_gather_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_imag_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_imag_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_imag_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_index_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_index_add_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_index_add_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_index_select_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_index_select_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_index_select_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isclose_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isclose_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isfinite_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isfinite_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isfinite_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isinf_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isinf_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isinf_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isnan_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_isnan_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_log1p_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_log1p_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_log_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_log_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_log_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_logical_and_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_logical_and_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_logical_not_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_logical_not_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_logical_or_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_logical_or_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_logical_xor_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_logical_xor_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_masked_fill_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_masked_fill_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_masked_fill_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_masked_scatter_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_masked_scatter_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_mean_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_mean_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_mm_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_mm_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_mul_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_mul_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_mul_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_ne_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_ne_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_neg_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_neg_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_neg_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_new_zeros_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_new_zeros_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_new_zeros_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_permute_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_permute_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_permute_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_pow_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_pow_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_pow_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_prod_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_prod_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_prod_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_randn_like_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_randn_like_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_randn_like_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_real_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_real_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_real_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_reciprocal_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_reciprocal_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_repeat_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_repeat_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_rsqrt_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_rsqrt_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_rsqrt_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_rsub_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_rsub_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_scatter_add_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_scatter_add_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_select_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_select_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_select_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sgn_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sgn_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sgn_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sin_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sin_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sin_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sinh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sinh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sinh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_slice_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_slice_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_slice_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_split_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_split_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_split_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_split_list_args_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_split_list_args_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_split_with_sizes_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_split_with_sizes_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_split_with_sizes_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sqrt_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sqrt_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sqrt_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_squeeze_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_squeeze_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_squeeze_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_squeeze_multiple_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_squeeze_multiple_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_squeeze_multiple_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_stack_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_stack_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_stack_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sub_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sub_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sub_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sum_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sum_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_sum_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_t_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_t_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_tan_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_tan_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_tan_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_tanh_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_tanh_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_tanh_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_to_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_to_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_transpose_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_transpose_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_transpose_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_true_divide_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_true_divide_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_true_divide_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_unsqueeze_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_unsqueeze_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_unsqueeze_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_var_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_var_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_var_unbiased_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_var_unbiased_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_view_as_real_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_view_as_real_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_view_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_view_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_view_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_where_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_where_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_where_cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_zero__cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_zero__cuda_complex64, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_zeros_like_cuda_complex128, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_zeros_like_cuda_complex32, test/complex_tensor/test_complex_tensor.py::TestComplexDistributedCUDA::test_distributed_zeros_like_cuda_complex64 2025-12-04T12:04:10.1117930Z 2025-12-04T12:04:10.1118281Z Finished complex_tensor/test_complex_tensor 1/1 ... [2025-12-04 12:04:10.058131][10328.955690261], took 8.08min 2025-12-04T12:04:10.1119423Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/complex_tensor.test_complex_tensor/complex_tensor.test_complex_tensor-8b23cb700c6b2092.xml 2025-12-04T12:04:10.1606227Z Running benchmark_utils/test_benchmark_utils 1/1 ... [2025-12-04 12:04:10.160173][10329.057738344] 2025-12-04T12:04:10.1607038Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T12:04:10.1609202Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'benchmark_utils/test_benchmark_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:04:10.160542] 2025-12-04T12:04:15.6857280Z 2025-12-04T12:04:15.6858264Z benchmark_utils/test_benchmark_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/benchmark_utils.test_benchmark_utils_1.1_08587b185fc67fa8_.log 2025-12-04T12:04:15.6862068Z Running 9 items in this shard: test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_adaptive_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_collect_callgrind, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_collect_cpp_callgrind, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_compare, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_cpp_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_fuzzer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_manipulate_callgrind_stats, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_timer_tiny_fast_snippet 2025-12-04T12:04:15.6865579Z 2025-12-04T12:04:15.6865922Z Finished benchmark_utils/test_benchmark_utils 1/1 ... [2025-12-04 12:04:15.685310][10334.582879094], took 0.09min 2025-12-04T12:04:15.6960979Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/benchmark_utils.test_benchmark_utils/benchmark_utils.test_benchmark_utils-2888f3c320a1b2e3.xml 2025-12-04T12:04:15.7308056Z Running torch_np/numpy_tests/core/test_scalarmath 1/1 ... [2025-12-04 12:04:15.730453][10334.628022885] 2025-12-04T12:04:15.7308601Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T12:04:15.7311428Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_scalarmath.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:04:15.730763] 2025-12-04T12:04:40.6365309Z 2025-12-04T12:04:40.6366961Z torch_np/numpy_tests/core/test_scalarmath 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_scalarmath_1.1_9879c80a9340a002_.log 2025-12-04T12:04:40.6453196Z Running 186 items in this shard: test/torch_np/numpy_tests/core/test_scalarmath.py::TestTypes::test_leak, test/torch_np/numpy_tests/core/test_scalarmath.py::TestTypes::test_type_add, test/torch_np/numpy_tests/core/test_scalarmath.py::TestTypes::test_type_create, test/torch_np/numpy_tests/core/test_scalarmath.py::TestTypes::test_types, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBaseMath::test_blocked, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBaseMath::test_lower_align, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_integers_to_negative_integer_power, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_large_types, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_mixed_types, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_modular_power, test/torch_np/numpy_tests/core/test_scalarmath.py::TestPower::test_small_types, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_corner_cases_dt_d, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_corner_cases_dt_e, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_corner_cases_dt_f, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_exact, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_float_modulus_roundoff, test/torch_np/numpy_tests/core/test_scalarmath.py::TestModulus::test_modulus_basic, test/torch_np/numpy_tests/core/test_scalarmath.py::TestComplexDivision::test_branches, test/torch_np/numpy_tests/core/test_scalarmath.py::TestComplexDivision::test_signed_zeros, test/torch_np/numpy_tests/core/test_scalarmath.py::TestComplexDivision::test_zero_division, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_iinfo_long_values_1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_iinfo_long_values_2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_int_from_long, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_int_raise_behaviour, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_numpy_scalar_relational_operators, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_numpy_scalar_relational_operators_2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestConversion::test_scalar_comparison_to_none, test/torch_np/numpy_tests/core/test_scalarmath.py::TestRepr::test_float_repr, test/torch_np/numpy_tests/core/test_scalarmath.py::TestMultiply::test_no_seq_repeat_basic_array_like, test/torch_np/numpy_tests/core/test_scalarmath.py::TestMultiply::test_seq_repeat, test/torch_np/numpy_tests/core/test_scalarmath.py::TestNegative::test_exceptions, test/torch_np/numpy_tests/core/test_scalarmath.py::TestNegative::test_result, test/torch_np/numpy_tests/core/test_scalarmath.py::TestSubtract::test_exceptions, test/torch_np/numpy_tests/core/test_scalarmath.py::TestSubtract::test_result, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_builtin_abs_dtype4, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestAbs::test_numpy_abs_dtype4, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_B_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_B_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_b_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_b_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_h_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_h_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_i_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_i_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_l_op0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestBitShifts::test_shift_all_bits_type_code_l_op1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_complex_hashes_type_code_D, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_complex_hashes_type_code_F, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_D, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_F, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_d, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_e, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_float_and_complex_hashes_type_code_f, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_B, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_b, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_h, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_i, test/torch_np/numpy_tests/core/test_scalarmath.py::TestHash::test_integer_hashes_type_code_l, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_B_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_B_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_b_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_b_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_h_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_h_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_i_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_i_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_l_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_divbyzero_dtype_l_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_B_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_B_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_B_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_b_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_b_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_b_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_h_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_h_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_h_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_i_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_i_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_i_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_l_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_l_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_integer_operation_overflow_dtype_l_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_b_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_b_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_b_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_b_operation3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_h_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_h_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_h_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_h_operation3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_i_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_i_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_i_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_i_operation3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_l_operation0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_l_operation1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_l_operation2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_signed_integer_overflow_dtype_l_operation3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarOpsMisc::test_scalar_unsigned_integer_overflow_dtype_B, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____add_____rop_____radd___op8_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____add_____rop_____radd___op8_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____add_____rop_____radd___op8_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____add_____rop_____radd___op8_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____eq_____rop_____eq___op2_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____eq_____rop_____eq___op2_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____eq_____rop_____eq___op2_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____eq_____rop_____eq___op2_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ge_____rop_____le___op5_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ge_____rop_____le___op5_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ge_____rop_____le___op5_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ge_____rop_____le___op5_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____gt_____rop_____lt___op4_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____gt_____rop_____lt___op4_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____gt_____rop_____lt___op4_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____gt_____rop_____lt___op4_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____le_____rop_____ge___op1_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____le_____rop_____ge___op1_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____le_____rop_____ge___op1_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____le_____rop_____ge___op1_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____lt_____rop_____gt___op0_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____lt_____rop_____gt___op0_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____lt_____rop_____gt___op0_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____lt_____rop_____gt___op0_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mod_____rop_____rmod___op9_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mod_____rop_____rmod___op9_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mod_____rop_____rmod___op9_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mod_____rop_____rmod___op9_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mul_____rop_____rmul___op10_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mul_____rop_____rmul___op10_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mul_____rop_____rmul___op10_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____mul_____rop_____rmul___op10_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ne_____rop_____ne___op3_cmp_True_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ne_____rop_____ne___op3_cmp_True_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ne_____rop_____ne___op3_cmp_True_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____ne_____rop_____ne___op3_cmp_True_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____pow_____rop_____rpow___op11_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____pow_____rop_____rpow___op11_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____pow_____rop_____rpow___op11_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____pow_____rop_____rpow___op11_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____sub_____rop_____rsub___op12_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____sub_____rop_____rsub___op12_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____sub_____rop_____rsub___op12_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____sub_____rop_____rsub___op12_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____truediv_____rop_____rtruediv___op7_cmp_False_subtype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____truediv_____rop_____rtruediv___op7_cmp_False_subtype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____truediv_____rop_____rtruediv___op7_cmp_False_subtype2, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_pyscalar_subclasses___op_____truediv_____rop_____rtruediv___op7_cmp_False_subtype3, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____add_____rop_____radd___op8_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____add_____rop_____radd___op8_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____eq_____rop_____eq___op2_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____eq_____rop_____eq___op2_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____floordiv_____rop_____rfloordiv___op6_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____ge_____rop_____le___op5_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____ge_____rop_____le___op5_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____gt_____rop_____lt___op4_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____gt_____rop_____lt___op4_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____le_____rop_____ge___op1_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____le_____rop_____ge___op1_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____lt_____rop_____gt___op0_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____lt_____rop_____gt___op0_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____mod_____rop_____rmod___op9_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____mod_____rop_____rmod___op9_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____mul_____rop_____rmul___op10_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____mul_____rop_____rmul___op10_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____ne_____rop_____ne___op3_cmp_True_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____ne_____rop_____ne___op3_cmp_True_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____pow_____rop_____rpow___op11_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____pow_____rop_____rpow___op11_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____sub_____rop_____rsub___op12_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____sub_____rop_____rsub___op12_cmp_False_sctype1, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____truediv_____rop_____rtruediv___op7_cmp_False_sctype0, test/torch_np/numpy_tests/core/test_scalarmath.py::TestScalarSubclassingMisc::test_subclass_deferral___op_____truediv_____rop_____rtruediv___op7_cmp_False_sctype1 2025-12-04T12:04:40.6538013Z 2025-12-04T12:04:40.6538383Z Finished torch_np/numpy_tests/core/test_scalarmath 1/1 ... [2025-12-04 12:04:40.636553][10359.534119823], took 0.42min 2025-12-04T12:04:40.6539745Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalarmath/torch_np.numpy_tests.core.test_scalarmath-4a75920c20f79d0b.xml 2025-12-04T12:04:40.7470587Z Running nn/test_convolution 2/2 ... [2025-12-04 12:04:40.746587][10359.644154158] 2025-12-04T12:04:40.7471095Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-12-04T12:04:40.7472989Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_convolution.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:04:40.746933] 2025-12-04T12:23:43.2137877Z 2025-12-04T12:23:43.2138800Z nn/test_convolution 2/2 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_convolution_2.2_bfa725818997bc8c_.log 2025-12-04T12:23:43.2300370Z Running 316 items in this shard: test/nn/test_convolution.py::TestConvolutionNN::test_Conv1d_module_same_padding, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_1x1, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_backward_twice, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_groups_nobias_v2, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_inconsistent_types, test/nn/test_convolution.py::TestConvolutionNN::test_Conv2d_module_same_padding, test/nn/test_convolution.py::TestConvolutionNN::test_Conv3d_groups_nobias, test/nn/test_convolution.py::TestConvolutionNN::test_Conv3d_module_same_padding, test/nn/test_convolution.py::TestConvolutionNN::test_ConvTranspose3d_correct_output_size, test/nn/test_convolution.py::TestConvolutionNN::test_conv1d_issue_120547, test/nn/test_convolution.py::TestConvolutionNN::test_conv_cudnn_memory_layout_dominance, test/nn/test_convolution.py::TestConvolutionNN::test_conv_shapecheck, test/nn/test_convolution.py::TestConvolutionNN::test_conv_tbc, test/nn/test_convolution.py::TestConvolutionNN::test_cudnn_non_contiguous, test/nn/test_convolution.py::TestConvolutionNN::test_cudnn_noncontiguous_weight, test/nn/test_convolution.py::TestConvolutionNN::test_cudnn_not_mutate_stride, test/nn/test_convolution.py::TestConvolutionNN::test_functional_grad_conv, test/nn/test_convolution.py::TestConvolutionNN::test_functional_grad_conv2d, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv1d_input, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv2d_input, test/nn/test_convolution.py::TestConvolutionNN::test_grad_conv3d_weight, test/nn/test_convolution.py::TestConvolutionNN::test_invalid_conv1d, test/nn/test_convolution.py::TestConvolutionNN::test_invalid_conv2d, test/nn/test_convolution.py::TestConvolutionNN::test_invalid_conv3d, test/nn/test_convolution.py::TestConvolutionNN::test_mismatch_shape_conv2d, test/nn/test_convolution.py::TestConvolutionNN::test_nnpack_conv, test/nn/test_convolution.py::TestConvolutionNN::test_thnn_conv_strided_padded_dilated, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_backward_depthwise_cuda_complex128, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_depthwise_naive_groups_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_depthwise_naive_groups_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_1_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_2_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_2_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_deterministic_cudnn_dilation_3_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_large_workspace_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_naive_groups_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_naive_groups_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv3d_depthwise_naive_groups_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv3d_depthwise_naive_groups_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_ConvTranspose2d_large_output_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_ConvTranspose2d_size_1_kernel_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_contig_wrong_stride_cudnn_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_same_padding_backward_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_same_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_same_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_valid_padding_backward_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_valid_padding_backward_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_valid_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_vs_scipy_mode_same_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv1d_vs_scipy_mode_valid_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_same_padding_backward_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_same_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_valid_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_valid_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_vs_scipy_mode_same_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv2d_vs_scipy_mode_valid_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_64bit_indexing_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_cudnn_broken_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_same_padding_backward_cuda_complex128, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_same_padding_backward_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_same_padding_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_same_padding_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_vs_scipy_mode_same_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv3d_vs_scipy_mode_valid_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cuda_depthwise3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn1d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn2d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_cudnn3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_batch_channel3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_empty_channel3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen1d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen2d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen3d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_miopen_depthwise3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_cpu_input_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn1d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_cpu_input_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn2d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_cpu_input_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn3d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_batch_channel3d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_mkldnn_empty_channel3d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_dilated_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow1d_transposed_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_dilated_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow2d_transposed_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cpu_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_False_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_False_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_cuda_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_False_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_False_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_True_strided_False_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_True_strided_False_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_True_strided_True_contiguous_False_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_backend_slow3d_dilated_has_bias_True_strided_True_contiguous_True_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_ndhwc_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_cuda_float16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_cudnn_nhwc_support_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_cuda_float64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_groups_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_no_bias_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_stride_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_double_backward_strided_with_3D_input_and_weight_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_empty_channel_cuda_complex64, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_empty_channel_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_ic1_channels_last_for_oneDNN_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_large_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_large_nosplit_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_noncontig_weights_and_bias_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_noncontig_weights_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_thnn_nhwc_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_transpose_with_output_size_and_no_batch_dim_ConvTranspose2d_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_conv_transposed_large_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_convert_conv2d_weight_memory_format_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_cudnn_convolution_add_relu_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_cudnn_convolution_relu_cuda_float32, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_group_conv_empty_cuda, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_noncontig_conv_grad_cuda_bfloat16, test/nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_noncontig_conv_grad_cuda_float32 2025-12-04T12:23:43.2460373Z 2025-12-04T12:23:43.2460726Z Finished nn/test_convolution 2/2 ... [2025-12-04 12:23:43.214171][11502.111739545], took 19.04min 2025-12-04T12:23:43.2461874Z Parsing testcases for test report: /var/lib/jenkins/workspace/test/test-reports/python-pytest/nn.test_convolution/nn.test_convolution-8ad1e6986509c27f.xml 2025-12-04T12:23:44.4499228Z Uploading artifacts took 1.14 seconds 2025-12-04T12:23:48.2096593Z Running test batch 'tests to run' cost 10567.81 seconds 2025-12-04T12:23:48.9434159Z 2025-12-04T12:23:48.9434732Z real 176m13.787s 2025-12-04T12:23:48.9435004Z user 169m10.993s 2025-12-04T12:23:48.9435224Z sys 21m9.431s 2025-12-04T12:23:48.9435452Z + assert_git_not_dirty 2025-12-04T12:23:48.9435829Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck != *rocm* ]] 2025-12-04T12:23:48.9436323Z + [[ linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck != *xla* ]] 2025-12-04T12:23:48.9443267Z ++ git status --porcelain 2025-12-04T12:23:48.9443632Z ++ grep -v '?? third_party' 2025-12-04T12:23:53.2226995Z ++ true 2025-12-04T12:23:53.2227309Z + git_status= 2025-12-04T12:23:53.2227581Z + [[ -n '' ]] 2025-12-04T12:23:53.2227877Z + sccache_epilogue 2025-12-04T12:23:53.2228187Z + echo '::group::Sccache Compilation Log' 2025-12-04T12:23:53.2228810Z ##[group]Sccache Compilation Log 2025-12-04T12:23:53.2229159Z + echo '=================== sccache compilation log ===================' 2025-12-04T12:23:53.2229824Z =================== sccache compilation log =================== 2025-12-04T12:23:53.2233039Z + python /var/lib/jenkins/workspace/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log 2025-12-04T12:23:53.2387820Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ===========' 2025-12-04T12:23:53.2388517Z =========== If your build fails, please take a look at the log above for possible reasons =========== 2025-12-04T12:23:53.2411801Z + sccache --show-stats 2025-12-04T12:23:53.2417661Z Compile requests 2835 2025-12-04T12:23:53.2418044Z Compile requests executed 268 2025-12-04T12:23:53.2418349Z Cache hits 129 2025-12-04T12:23:53.2418817Z Cache hits (C/C++) 129 2025-12-04T12:23:53.2419196Z Cache misses 139 2025-12-04T12:23:53.2419488Z Cache misses (C/C++) 139 2025-12-04T12:23:53.2419796Z Cache hits rate 48.13 % 2025-12-04T12:23:53.2420119Z Cache hits rate (C/C++) 48.13 % 2025-12-04T12:23:53.2420423Z Cache timeouts 0 2025-12-04T12:23:53.2420724Z Cache read errors 0 2025-12-04T12:23:53.2421031Z Forced recaches 0 2025-12-04T12:23:53.2421331Z Cache write errors 0 2025-12-04T12:23:53.2421635Z Cache errors 0 2025-12-04T12:23:53.2421934Z Compilations 139 2025-12-04T12:23:53.2422245Z Compilation failures 0 2025-12-04T12:23:53.2422563Z Non-cacheable compilations 0 2025-12-04T12:23:53.2422981Z Non-cacheable calls 141 2025-12-04T12:23:53.2423297Z Non-compilation calls 2426 2025-12-04T12:23:53.2423619Z Unsupported compiler calls 0 2025-12-04T12:23:53.2423945Z Average cache write 0.041 s 2025-12-04T12:23:53.2424263Z Average compiler 6.087 s 2025-12-04T12:23:53.2424766Z Average cache read hit 0.023 s 2025-12-04T12:23:53.2425092Z Failed distributed compilations 0 2025-12-04T12:23:53.2425310Z 2025-12-04T12:23:53.2425412Z Non-cacheable reasons: 2025-12-04T12:23:53.2425684Z unknown source language 108 2025-12-04T12:23:53.2425977Z -E 33 2025-12-04T12:23:53.2426180Z 2025-12-04T12:23:53.2426415Z Cache location s3, name: ossci-compiler-cache-circleci-v2, prefix: / 2025-12-04T12:23:53.2426856Z Version (client) 0.10.0 2025-12-04T12:23:53.2427156Z + sccache --stop-server 2025-12-04T12:23:53.2447876Z Stopping sccache server... 2025-12-04T12:23:53.2451214Z Compile requests 2835 2025-12-04T12:23:53.2451565Z Compile requests executed 268 2025-12-04T12:23:53.2451876Z Cache hits 129 2025-12-04T12:23:53.2452180Z Cache hits (C/C++) 129 2025-12-04T12:23:53.2452483Z Cache misses 139 2025-12-04T12:23:53.2452785Z Cache misses (C/C++) 139 2025-12-04T12:23:53.2453097Z Cache hits rate 48.13 % 2025-12-04T12:23:53.2453412Z Cache hits rate (C/C++) 48.13 % 2025-12-04T12:23:53.2453720Z Cache timeouts 0 2025-12-04T12:23:53.2454026Z Cache read errors 0 2025-12-04T12:23:53.2454335Z Forced recaches 0 2025-12-04T12:23:53.2454645Z Cache write errors 0 2025-12-04T12:23:53.2454938Z Cache errors 0 2025-12-04T12:23:53.2455233Z Compilations 139 2025-12-04T12:23:53.2455544Z Compilation failures 0 2025-12-04T12:23:53.2455926Z Non-cacheable compilations 0 2025-12-04T12:23:53.2456285Z Non-cacheable calls 141 2025-12-04T12:23:53.2456602Z Non-compilation calls 2426 2025-12-04T12:23:53.2457005Z Unsupported compiler calls 0 2025-12-04T12:23:53.2457343Z Average cache write 0.041 s 2025-12-04T12:23:53.2457856Z Average compiler 6.087 s 2025-12-04T12:23:53.2458172Z Average cache read hit 0.023 s 2025-12-04T12:23:53.2458589Z Failed distributed compilations 0 2025-12-04T12:23:53.2458812Z 2025-12-04T12:23:53.2458909Z Non-cacheable reasons: 2025-12-04T12:23:53.2459193Z unknown source language 108 2025-12-04T12:23:53.2459485Z -E 33 2025-12-04T12:23:53.2459697Z 2025-12-04T12:23:53.2459960Z Cache location s3, name: ossci-compiler-cache-circleci-v2, prefix: / 2025-12-04T12:23:53.2460410Z Version (client) 0.10.0 2025-12-04T12:23:53.2460705Z + echo ::endgroup:: 2025-12-04T12:23:53.2461184Z ##[endgroup] 2025-12-04T12:23:53.2461406Z + cleanup_workspace 2025-12-04T12:23:53.2461985Z + echo 'sudo may print the following warning message that can be ignored. The chown command will still run.' 2025-12-04T12:23:53.2462803Z sudo may print the following warning message that can be ignored. The chown command will still run. 2025-12-04T12:23:53.2463514Z + echo ' sudo: setrlimit(RLIMIT_STACK): Operation not permitted' 2025-12-04T12:23:53.2463974Z sudo: setrlimit(RLIMIT_STACK): Operation not permitted 2025-12-04T12:23:53.2464500Z + echo 'For more details refer to https://github.com/sudo-project/sudo/issues/42' 2025-12-04T12:23:53.2465071Z For more details refer to https://github.com/sudo-project/sudo/issues/42 2025-12-04T12:23:53.2465533Z + sudo chown -R 1000 /var/lib/jenkins/workspace 2025-12-04T12:23:54.3212016Z ##[group]Run pytorch/test-infra/.github/actions/upload-benchmark-results@main 2025-12-04T12:23:54.3212485Z with: 2025-12-04T12:23:54.3212742Z benchmark-results-dir: test/test-reports 2025-12-04T12:23:54.3213061Z dry-run: false 2025-12-04T12:23:54.3213291Z schema-version: v3 2025-12-04T12:23:54.3213725Z github-token: *** 2025-12-04T12:23:54.3213962Z env: 2025-12-04T12:23:54.3214172Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:23:54.3214434Z HAS_NVIDIA_GPU: true 2025-12-04T12:23:54.3224013Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:23:54.3224922Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:23:54.3225422Z ##[endgroup] 2025-12-04T12:23:54.3244114Z ##[group]Run set -eux 2025-12-04T12:23:54.3244397Z set -eux 2025-12-04T12:23:54.3244636Z  2025-12-04T12:23:54.3244860Z if [[ -n "" ]]; then 2025-12-04T12:23:54.3245129Z  source "" 2025-12-04T12:23:54.3245399Z fi 2025-12-04T12:23:54.3245747Z python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-12-04T12:23:54.3246166Z  2025-12-04T12:23:54.3246390Z DEVICE_NAME="" 2025-12-04T12:23:54.3246655Z DEVICE_TYPE="" 2025-12-04T12:23:54.3246899Z  2025-12-04T12:23:54.3247146Z if command -v nvidia-smi; then 2025-12-04T12:23:54.3247601Z  # NB: I'm using PyTorch here to get the device name, however, it needs to 2025-12-04T12:23:54.3248171Z  # install the correct version of PyTorch manually for now. Any PyTorch 2025-12-04T12:23:54.3248709Z  # version is fine, I just use 2.7.1 to satify PYPIDEP linter 2025-12-04T12:23:54.3249147Z  python3 -mpip install torch==2.7.1 2025-12-04T12:23:54.3249505Z elif command -v rocminfo; then 2025-12-04T12:23:54.3249942Z  # NB: Installing torch on ROCm runner with pip here causes CI to fail 2025-12-04T12:23:54.3250501Z  # with a memoryview is too large error only on MI300 runners. Is pip 2025-12-04T12:23:54.3251053Z  # version on ROCm runner there too old? As a workaround, let's use the 2025-12-04T12:23:54.3251538Z  # GPU device name coming from rocminfo instead 2025-12-04T12:23:54.3251903Z  DEVICE_NAME=rocm 2025-12-04T12:23:54.3252386Z  DEVICE_TYPE=$(rocminfo | grep "Marketing Name" | tail -n1 | awk -F':' '{print $2}' | xargs) 2025-12-04T12:23:54.3252867Z fi 2025-12-04T12:23:54.3253188Z  2025-12-04T12:23:54.3253467Z echo "DEVICE_NAME=$DEVICE_NAME" >> $GITHUB_ENV 2025-12-04T12:23:54.3253881Z echo "DEVICE_TYPE=$DEVICE_TYPE" >> $GITHUB_ENV 2025-12-04T12:23:54.3268825Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:23:54.3269193Z env: 2025-12-04T12:23:54.3269421Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:23:54.3269686Z HAS_NVIDIA_GPU: true 2025-12-04T12:23:54.3270013Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:23:54.3270570Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:23:54.3271052Z ##[endgroup] 2025-12-04T12:23:54.3313497Z + [[ -n '' ]] 2025-12-04T12:23:54.3313852Z + python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-12-04T12:23:54.5710141Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T12:23:55.8517841Z Collecting boto3==1.35.33 2025-12-04T12:23:55.8676764Z Downloading boto3-1.35.33-py3-none-any.whl (139 kB) 2025-12-04T12:23:56.2386512Z Collecting psutil==7.0.0 2025-12-04T12:23:56.2420486Z Downloading psutil-7.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (277 kB) 2025-12-04T12:23:56.2775666Z Collecting pynvml==12.0.0 2025-12-04T12:23:56.2807045Z Downloading pynvml-12.0.0-py3-none-any.whl (26 kB) 2025-12-04T12:23:56.3341924Z Collecting s3transfer<0.11.0,>=0.10.0 2025-12-04T12:23:56.3372442Z Downloading s3transfer-0.10.4-py3-none-any.whl (83 kB) 2025-12-04T12:23:57.7513057Z Collecting botocore<1.36.0,>=1.35.33 2025-12-04T12:23:57.7548921Z Downloading botocore-1.35.99-py3-none-any.whl (13.3 MB) 2025-12-04T12:23:57.9010618Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /usr/lib/python3.9/site-packages (from boto3==1.35.33) (0.10.0) 2025-12-04T12:23:57.9535086Z Collecting nvidia-ml-py<13.0.0a0,>=12.0.0 2025-12-04T12:23:57.9565066Z Downloading nvidia_ml_py-12.575.51-py3-none-any.whl (47 kB) 2025-12-04T12:23:57.9668422Z Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/lib/python3.9/site-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (2.8.1) 2025-12-04T12:23:57.9678038Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/lib/python3.9/site-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.25.10) 2025-12-04T12:23:58.1753697Z Requirement already satisfied: six>=1.5 in /usr/lib/python3.9/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.15.0) 2025-12-04T12:23:58.3142094Z Installing collected packages: botocore, s3transfer, nvidia-ml-py, pynvml, psutil, boto3 2025-12-04T12:23:58.8862088Z Attempting uninstall: nvidia-ml-py 2025-12-04T12:23:58.8863141Z Found existing installation: nvidia-ml-py 11.525.84 2025-12-04T12:23:58.8878473Z Uninstalling nvidia-ml-py-11.525.84: 2025-12-04T12:23:58.9134889Z Successfully uninstalled nvidia-ml-py-11.525.84 2025-12-04T12:23:58.9746532Z Attempting uninstall: psutil 2025-12-04T12:23:58.9747206Z Found existing installation: psutil 5.9.8 2025-12-04T12:23:58.9836581Z Uninstalling psutil-5.9.8: 2025-12-04T12:23:58.9844019Z Successfully uninstalled psutil-5.9.8 2025-12-04T12:23:59.1555549Z Successfully installed boto3-1.35.33 botocore-1.35.99 nvidia-ml-py-12.575.51 psutil-7.0.0 pynvml-12.0.0 s3transfer-0.10.4 2025-12-04T12:23:59.2857341Z + DEVICE_NAME= 2025-12-04T12:23:59.2857585Z + DEVICE_TYPE= 2025-12-04T12:23:59.2857825Z /usr/bin/nvidia-smi 2025-12-04T12:23:59.2858083Z + command -v nvidia-smi 2025-12-04T12:23:59.2858367Z + python3 -mpip install torch==2.7.1 2025-12-04T12:23:59.5248034Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T12:23:59.8304671Z Collecting torch==2.7.1 2025-12-04T12:23:59.8467024Z Downloading torch-2.7.1-cp39-cp39-manylinux_2_28_x86_64.whl (821.1 MB) 2025-12-04T12:24:12.9510409Z Collecting nvidia-cuda-nvrtc-cu12==12.6.77 2025-12-04T12:24:12.9542971Z Downloading nvidia_cuda_nvrtc_cu12-12.6.77-py3-none-manylinux2014_x86_64.whl (23.7 MB) 2025-12-04T12:24:13.2368152Z Collecting networkx 2025-12-04T12:24:13.2399900Z Downloading networkx-3.2.1-py3-none-any.whl (1.6 MB) 2025-12-04T12:24:13.2938251Z Collecting nvidia-cusolver-cu12==11.7.1.2 2025-12-04T12:24:13.2971888Z Downloading nvidia_cusolver_cu12-11.7.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (158.2 MB) 2025-12-04T12:24:14.8713848Z Collecting nvidia-curand-cu12==10.3.7.77 2025-12-04T12:24:14.8754625Z Downloading nvidia_curand_cu12-10.3.7.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (56.3 MB) 2025-12-04T12:24:15.3817530Z Requirement already satisfied: jinja2 in /usr/lib/python3.9/site-packages (from torch==2.7.1) (2.11.3) 2025-12-04T12:24:15.4002337Z Collecting nvidia-cufile-cu12==1.11.1.6 2025-12-04T12:24:15.4034919Z Downloading nvidia_cufile_cu12-1.11.1.6-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.1 MB) 2025-12-04T12:24:15.4211472Z Requirement already satisfied: typing-extensions>=4.10.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from torch==2.7.1) (4.15.0) 2025-12-04T12:24:15.4895436Z Collecting fsspec 2025-12-04T12:24:15.4925561Z Downloading fsspec-2025.10.0-py3-none-any.whl (200 kB) 2025-12-04T12:24:15.5489306Z Collecting triton==3.3.1 2025-12-04T12:24:15.5521240Z Downloading triton-3.3.1-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (155.6 MB) 2025-12-04T12:24:17.0731711Z Collecting nvidia-cudnn-cu12==9.5.1.17 2025-12-04T12:24:17.0788839Z Downloading nvidia_cudnn_cu12-9.5.1.17-py3-none-manylinux_2_28_x86_64.whl (571.0 MB) 2025-12-04T12:24:25.1037495Z Collecting nvidia-cuda-runtime-cu12==12.6.77 2025-12-04T12:24:25.1071935Z Downloading nvidia_cuda_runtime_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (897 kB) 2025-12-04T12:24:25.1529870Z Collecting nvidia-nvtx-cu12==12.6.77 2025-12-04T12:24:25.1562153Z Downloading nvidia_nvtx_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB) 2025-12-04T12:24:25.2168674Z Collecting filelock 2025-12-04T12:24:25.2201605Z Downloading filelock-3.19.1-py3-none-any.whl (15 kB) 2025-12-04T12:24:25.2557678Z Collecting nvidia-nvjitlink-cu12==12.6.85 2025-12-04T12:24:25.2599227Z Downloading nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB) 2025-12-04T12:24:25.4984926Z Collecting nvidia-cublas-cu12==12.6.4.1 2025-12-04T12:24:25.5022530Z Downloading nvidia_cublas_cu12-12.6.4.1-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (393.1 MB) 2025-12-04T12:24:30.7664661Z Collecting nvidia-cufft-cu12==11.3.0.4 2025-12-04T12:24:30.7723913Z Downloading nvidia_cufft_cu12-11.3.0.4-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (200.2 MB) 2025-12-04T12:24:33.0460271Z Collecting nvidia-cuda-cupti-cu12==12.6.80 2025-12-04T12:24:33.0492523Z Downloading nvidia_cuda_cupti_cu12-12.6.80-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (8.9 MB) 2025-12-04T12:24:33.1509310Z Collecting nvidia-cusparselt-cu12==0.6.3 2025-12-04T12:24:33.1540098Z Downloading nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl (156.8 MB) 2025-12-04T12:24:34.7185621Z Collecting nvidia-cusparse-cu12==12.5.4.2 2025-12-04T12:24:34.7241200Z Downloading nvidia_cusparse_cu12-12.5.4.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (216.6 MB) 2025-12-04T12:24:37.1310653Z Collecting nvidia-nccl-cu12==2.26.2 2025-12-04T12:24:37.1374672Z Downloading nvidia_nccl_cu12-2.26.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (201.3 MB) 2025-12-04T12:24:39.3969459Z Collecting sympy>=1.13.3 2025-12-04T12:24:39.4007019Z Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB) 2025-12-04T12:24:39.4888842Z Requirement already satisfied: setuptools>=40.8.0 in /usr/lib/python3.9/site-packages (from triton==3.3.1->torch==2.7.1) (59.6.0) 2025-12-04T12:24:39.5213665Z Collecting mpmath<1.4,>=1.1.0 2025-12-04T12:24:39.5246920Z Downloading mpmath-1.3.0-py3-none-any.whl (536 kB) 2025-12-04T12:24:39.6189967Z Requirement already satisfied: MarkupSafe>=0.23 in /usr/lib64/python3.9/site-packages (from jinja2->torch==2.7.1) (1.1.1) 2025-12-04T12:24:39.9777147Z Installing collected packages: nvidia-nvjitlink-cu12, nvidia-cusparse-cu12, nvidia-cublas-cu12, mpmath, triton, sympy, nvidia-nvtx-cu12, nvidia-nccl-cu12, nvidia-cusparselt-cu12, nvidia-cusolver-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, networkx, fsspec, filelock, torch 2025-12-04T12:24:49.3168163Z WARNING: The scripts proton and proton-viewer are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2025-12-04T12:24:49.3169028Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-12-04T12:24:53.4614137Z WARNING: The script isympy is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2025-12-04T12:24:53.4615256Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-12-04T12:25:24.9611159Z WARNING: The scripts torchfrtrace and torchrun are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2025-12-04T12:25:24.9612045Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-12-04T12:25:25.0590879Z Successfully installed filelock-3.19.1 fsspec-2025.10.0 mpmath-1.3.0 networkx-3.2.1 nvidia-cublas-cu12-12.6.4.1 nvidia-cuda-cupti-cu12-12.6.80 nvidia-cuda-nvrtc-cu12-12.6.77 nvidia-cuda-runtime-cu12-12.6.77 nvidia-cudnn-cu12-9.5.1.17 nvidia-cufft-cu12-11.3.0.4 nvidia-cufile-cu12-1.11.1.6 nvidia-curand-cu12-10.3.7.77 nvidia-cusolver-cu12-11.7.1.2 nvidia-cusparse-cu12-12.5.4.2 nvidia-cusparselt-cu12-0.6.3 nvidia-nccl-cu12-2.26.2 nvidia-nvjitlink-cu12-12.6.85 nvidia-nvtx-cu12-12.6.77 sympy-1.14.0 torch-2.7.1 triton-3.3.1 2025-12-04T12:25:25.7583988Z + echo DEVICE_NAME= 2025-12-04T12:25:25.7584503Z + echo DEVICE_TYPE= 2025-12-04T12:25:25.7613190Z ##[group]Run set -eux 2025-12-04T12:25:25.7613482Z set -eux 2025-12-04T12:25:25.7613712Z  2025-12-04T12:25:25.7613956Z if [[ -z "${GITHUB_TOKEN}" ]]; then 2025-12-04T12:25:25.7614315Z  echo "Missing github-token input" 2025-12-04T12:25:25.7614636Z  exit 1 2025-12-04T12:25:25.7614859Z fi 2025-12-04T12:25:25.7628656Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:25.7629042Z env: 2025-12-04T12:25:25.7629258Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:25.7629540Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:25.7629859Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:25.7630408Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:25.7630893Z DEVICE_NAME: 2025-12-04T12:25:25.7631112Z DEVICE_TYPE: 2025-12-04T12:25:25.7631557Z GITHUB_TOKEN: *** 2025-12-04T12:25:25.7631792Z ##[endgroup] 2025-12-04T12:25:25.7674155Z + [[ -z *** ]] 2025-12-04T12:25:25.7729000Z ##[group]Run pytorch/test-infra/.github/actions/get-workflow-job-id@main 2025-12-04T12:25:25.7729437Z with: 2025-12-04T12:25:25.7729782Z github-token: *** 2025-12-04T12:25:25.7730018Z env: 2025-12-04T12:25:25.7730240Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:25.7730504Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:25.7730824Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:25.7731372Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:25.7731861Z DEVICE_NAME: 2025-12-04T12:25:25.7732086Z DEVICE_TYPE: 2025-12-04T12:25:25.7732313Z ##[endgroup] 2025-12-04T12:25:25.7748508Z ##[group]Run set -eux 2025-12-04T12:25:25.7748768Z set -eux 2025-12-04T12:25:25.7748996Z  2025-12-04T12:25:25.7749468Z python3 "${GITHUB_ACTION_PATH}/../../scripts/get_workflow_job_id.py" "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-12-04T12:25:25.7758887Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:25.7759255Z env: 2025-12-04T12:25:25.7759592Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:25.7759864Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:25.7760194Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:25.7760746Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:25.7761231Z DEVICE_NAME: 2025-12-04T12:25:25.7761455Z DEVICE_TYPE: 2025-12-04T12:25:25.7761986Z GITHUB_TOKEN: *** 2025-12-04T12:25:25.7762225Z ##[endgroup] 2025-12-04T12:25:25.7793232Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/get-workflow-job-id/../../scripts/get_workflow_job_id.py 19922826259 i-04e943c8c4e7268ee 2025-12-04T12:25:29.1679901Z setting job-id=57118183206 2025-12-04T12:25:29.1680670Z setting job-name=linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T12:25:29.1805931Z ##[group]Run set -eux 2025-12-04T12:25:29.1806207Z set -eux 2025-12-04T12:25:29.1806442Z  2025-12-04T12:25:29.1806659Z if [[ -n "" ]]; then 2025-12-04T12:25:29.1806935Z  source "" 2025-12-04T12:25:29.1807170Z fi 2025-12-04T12:25:29.1807393Z  2025-12-04T12:25:29.1807782Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_metadata.py" \ 2025-12-04T12:25:29.1808292Z  --schema-version "${SCHEMA_VERSION}" \ 2025-12-04T12:25:29.1808828Z  --repo "${REPO}" \ 2025-12-04T12:25:29.1809140Z  --head-branch "${HEAD_BRANCH}" \ 2025-12-04T12:25:29.1809470Z  --head-sha "${HEAD_SHA}" \ 2025-12-04T12:25:29.1809813Z  --workflow-id "${WORKFLOW_RUN_ID}" \ 2025-12-04T12:25:29.1810178Z  --run-attempt "${RUN_ATTEMPT}" \ 2025-12-04T12:25:29.1810498Z  --job-id "${JOB_ID}" \ 2025-12-04T12:25:29.1810807Z  --job-name "${JOB_NAME}" 2025-12-04T12:25:29.1821248Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:29.1821622Z env: 2025-12-04T12:25:29.1821838Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:29.1822107Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:29.1822426Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:29.1823091Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:29.1823580Z DEVICE_NAME: 2025-12-04T12:25:29.1823809Z DEVICE_TYPE: 2025-12-04T12:25:29.1824035Z SCHEMA_VERSION: v3 2025-12-04T12:25:29.1824647Z REPO: pytorch/pytorch 2025-12-04T12:25:29.1825001Z HEAD_BRANCH: refs/heads/main 2025-12-04T12:25:29.1825419Z HEAD_SHA: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T12:25:29.1825852Z WORKFLOW_RUN_ID: 19922826259 2025-12-04T12:25:29.1826170Z RUN_ATTEMPT: 1 2025-12-04T12:25:29.1826393Z JOB_ID: 57118183206 2025-12-04T12:25:29.1827044Z JOB_NAME: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T12:25:29.1827736Z ##[endgroup] 2025-12-04T12:25:29.1859624Z + [[ -n '' ]] 2025-12-04T12:25:29.1861779Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_metadata.py --schema-version v3 --repo pytorch/pytorch --head-branch refs/heads/main --head-sha ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 --workflow-id 19922826259 --run-attempt 1 --job-id 57118183206 --job-name 'linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check)' 2025-12-04T12:25:29.2353862Z ##[group]Run set -eux 2025-12-04T12:25:29.2354137Z set -eux 2025-12-04T12:25:29.2354362Z  2025-12-04T12:25:29.2354588Z if [[ -n "" ]]; then 2025-12-04T12:25:29.2354867Z  source "" 2025-12-04T12:25:29.2355107Z fi 2025-12-04T12:25:29.2355323Z  2025-12-04T12:25:29.2355712Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_runners_info.py" 2025-12-04T12:25:29.2365047Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:29.2365412Z env: 2025-12-04T12:25:29.2365629Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:29.2365894Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:29.2366216Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:29.2366769Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:29.2367272Z DEVICE_NAME: 2025-12-04T12:25:29.2367501Z DEVICE_TYPE: 2025-12-04T12:25:29.2367733Z ##[endgroup] 2025-12-04T12:25:29.2398633Z + [[ -n '' ]] 2025-12-04T12:25:29.2399426Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_runners_info.py 2025-12-04T12:25:30.2266267Z /home/ec2-user/.local/lib/python3.9/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.) 2025-12-04T12:25:30.2268565Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-12-04T12:25:31.2486945Z ##[group]Run set -eux 2025-12-04T12:25:31.2487215Z set -eux 2025-12-04T12:25:31.2487445Z  2025-12-04T12:25:31.2487701Z # TODO (huydhn): Implement this part 2025-12-04T12:25:31.2488084Z echo "dependencies={}" >> "${GITHUB_OUTPUT}" 2025-12-04T12:25:31.2497419Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:31.2497796Z env: 2025-12-04T12:25:31.2498017Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:31.2498286Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:31.2498608Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:31.2499156Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:31.2499644Z DEVICE_NAME: 2025-12-04T12:25:31.2499875Z DEVICE_TYPE: 2025-12-04T12:25:31.2500123Z ##[endgroup] 2025-12-04T12:25:31.2532670Z + echo 'dependencies={}' 2025-12-04T12:25:31.2587124Z ##[group]Run set -eux 2025-12-04T12:25:31.2587407Z set -eux 2025-12-04T12:25:31.2587642Z  2025-12-04T12:25:31.2587859Z if [[ -n "" ]]; then 2025-12-04T12:25:31.2588128Z  source "" 2025-12-04T12:25:31.2588362Z fi 2025-12-04T12:25:31.2588573Z  2025-12-04T12:25:31.2588842Z if [[ ! -d "${BENCHMARK_RESULTS_DIR}" ]]; then 2025-12-04T12:25:31.2589292Z  echo "${BENCHMARK_RESULTS_DIR} does not exist, skipping" 2025-12-04T12:25:31.2589821Z  # We don't want the job to fail if the directory doesn't exist 2025-12-04T12:25:31.2590206Z  exit 0 2025-12-04T12:25:31.2590423Z fi 2025-12-04T12:25:31.2590638Z  2025-12-04T12:25:31.2590880Z if [[ "${DRY_RUN}" == "true" ]]; then 2025-12-04T12:25:31.2591349Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-12-04T12:25:31.2591898Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-12-04T12:25:31.2592321Z  --metadata "${BENCHMARK_METADATA}" \ 2025-12-04T12:25:31.2592673Z  --runners "${RUNNER_INFO}" \ 2025-12-04T12:25:31.2593015Z  --dependencies "${DEPENDENCIES}" \ 2025-12-04T12:25:31.2593341Z  --dry-run 2025-12-04T12:25:31.2593587Z else 2025-12-04T12:25:31.2593967Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-12-04T12:25:31.2594501Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-12-04T12:25:31.2594921Z  --metadata "${BENCHMARK_METADATA}" \ 2025-12-04T12:25:31.2595279Z  --runners "${RUNNER_INFO}" \ 2025-12-04T12:25:31.2595617Z  --dependencies "${DEPENDENCIES}" 2025-12-04T12:25:31.2595938Z fi 2025-12-04T12:25:31.2604920Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:31.2605377Z env: 2025-12-04T12:25:31.2605596Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:31.2605862Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:31.2606171Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:31.2606711Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:31.2607186Z DEVICE_NAME: 2025-12-04T12:25:31.2607407Z DEVICE_TYPE: 2025-12-04T12:25:31.2607657Z BENCHMARK_RESULTS_DIR: test/test-reports 2025-12-04T12:25:31.2607975Z DRY_RUN: false 2025-12-04T12:25:31.2609435Z BENCHMARK_METADATA: {"timestamp": 1764851129, "schema_version": "v3", "name": "linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32", "workflow_id": 19922826259, "run_attempt": 1, "job_id": 57118183206} 2025-12-04T12:25:31.2611506Z RUNNER_INFO: [{"cpu_info": "x86_64", "cpu_count": 16, "avail_mem_in_gb": 62, "extra_info": {"hostname": "ip-10-0-73-227.ec2.internal"}, "name": "cuda", "type": "NVIDIA A10G", "gpu_count": 1, "avail_gpu_mem_in_gb": 22}] 2025-12-04T12:25:31.2612265Z DEPENDENCIES: {} 2025-12-04T12:25:31.2612492Z ##[endgroup] 2025-12-04T12:25:31.2641708Z + [[ -n '' ]] 2025-12-04T12:25:31.2642277Z + [[ ! -d test/test-reports ]] 2025-12-04T12:25:31.2642701Z + [[ false == \t\r\u\e ]] 2025-12-04T12:25:31.2647392Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py --benchmark-results-dir test/test-reports --metadata '{"timestamp": 1764851129, "schema_version": "v3", "name": "linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32", "workflow_id": 19922826259, "run_attempt": 1, "job_id": 57118183206}' --runners '[{"cpu_info": "x86_64", "cpu_count": 16, "avail_mem_in_gb": 62, "extra_info": {"hostname": "ip-10-0-73-227.ec2.internal"}, "name": "cuda", "type": "NVIDIA A10G", "gpu_count": 1, "avail_gpu_mem_in_gb": 22}]' --dependencies '{}' 2025-12-04T12:25:31.4145266Z /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'inductor/test_aot_inductor'}, {'test_file': 'inductor/test_torchinductor'}, {'test_file': 'inductor/test_torchinductor_dynamic_shapes'}, {'test_file': 'inductor/test_torchinductor_codegen_dynamic_shapes'}, {'test_file': 'inductor/test_kernel_benchmark'}, {'test_file': 'inductor/test_torchinductor_opinfo'}, {'test_file': 'inductor/test_pattern_matcher'}, {'test_file': 'inductor/test_cuda_repro'}, {'test_file': 'inductor/test_cudagraph_trees'}, {'test_file': 'dynamo/test_activation_checkpointing'}, {'test_file': 'dynamo/test_logging'}, {'test_file': 'dynamo/test_repros'}, {'test_file': 'inductor/test_flex_attention'}, {'test_file': 'inductor/test_cuda_select_algorithm'}, {'test_file': 'inductor/test_compile_subprocess'}, {'test_file': 'inductor/test_flex_decoding'}, {'test_file': 'inductor/test_deterministic'}, {'test_file': 'export/test_retraceability'}, {'test_file': 'inductor/test_fp8'}, {'test_file': 'dynamo/test_model_output'}, {'test_file': 'inductor/test_triton_kernels'}, {'test_file': 'inductor/test_extension_backend'}, {'test_file': 'inductor/test_native_matmul'}, {'test_file': 'inductor/test_loop_ordering'}, {'test_file': 'export/test_serdes'}, {'test_file': 'dynamo/test_regional_inductor'}, {'test_file': 'dynamo/test_fx_graph_runnable'}, {'test_file': 'dynamo/test_backends'}, {'test_file': 'inductor/test_aot_inductor_package'}, {'test_file': 'inductor/test_decompose_mem_bound_mm'}, {'test_file': 'inductor/test_op_dtype_prop'}, {'test_file': 'inductor/test_online_softmax'}, {'test_file': 'inductor/test_memory'}, {'test_file': 'dynamo/test_streams'}, {'test_file': 'inductor/test_unbacked_symints'}, {'test_file': 'inductor/test_scatter_optimization'}, {'test_file': 'inductor/test_mix_order_reduction'}, {'test_file': 'inductor/test_padding'}, {'test_file': 'dynamo/test_aot_compile'}, {'test_file': 'dynamo/test_sets'}, {'test_file': 'dynamo/test_wrap_inductor_compiled_regions'}, {'test_file': 'dynamo/test_callback'}, {'test_file': 'dynamo/test_compiler_bisector'}, {'test_file': 'inductor/test_custom_op_autotune'}, {'test_file': 'inductor/test_cudagraph_trees_expandable_segments'}, {'test_file': 'dynamo/test_decorators'}, {'test_file': 'test_privateuseone_python_backend'}, {'test_file': 'inductor/test_collective_autotuning'}, {'test_file': 'test_varlen_attention'}, {'test_file': 'test_cuda'}, {'test_file': 'test_transformers'}, {'test_file': 'test_autograd'}, {'test_file': 'test_sparse'}, {'test_file': 'higher_order_ops/test_local_map'}, {'test_file': 'test_dataloader'}, {'test_file': 'higher_order_ops/test_invoke_subgraph'}, {'test_file': 'test_ci_sanity_check_fail'}, {'test_file': 'test_ops_fwd_gradients'}, {'test_file': 'test_ops_gradients'}, {'test_file': 'test_nestedtensor'}, {'test_file': 'test_linalg'}, {'test_file': 'test_cuda_expandable_segments'}, {'test_file': 'test_public_bindings'}, {'test_file': 'functorch/test_dims'}, {'test_file': 'test_sparse_csr'}, {'test_file': 'functorch/test_ops'}, {'test_file': 'functorch/test_vmap'}, {'test_file': 'test_overrides'}, {'test_file': 'test_torchfuzz_repros'}, {'test_file': 'inductor/test_group_batch_fusion'}, {'test_file': 'dynamo/test_dynamic_shapes'}, {'test_file': 'inductor/test_cpu_repro'}, {'test_file': 'dynamo/test_after_aot'}, {'test_file': 'inductor/test_snode_runtime'}, {'test_file': 'inductor/test_minifier'}, {'test_file': 'inductor/test_compiled_autograd'}, {'test_file': 'inductor/test_custom_lowering'}, {'test_file': 'inductor/test_perf'}, {'test_file': 'inductor/test_fused_attention'}, {'test_file': 'inductor/test_binary_folding'}, {'test_file': 'inductor/test_mkldnn_pattern_matcher'}, {'test_file': 'inductor/test_inductor_freezing'}, {'test_file': 'inductor/test_layout_optim'}, {'test_file': 'dynamo/test_unspec'}, {'test_file': 'dynamo/test_higher_order_ops'}, {'test_file': 'inductor/test_mmdecomp'}, {'test_file': 'dynamo/test_ctx_manager'}, {'test_file': 'dynamo/test_exc'}, {'test_file': 'dynamo/test_misc'}, {'test_file': 'inductor/test_cpu_select_algorithm'}, {'test_file': 'inductor/test_aot_inductor_arrayref'}, {'test_file': 'inductor/test_cpu_cpp_wrapper'}, {'test_file': 'inductor/test_triton_cpu_backend'}, {'test_file': 'inductor/test_torchinductor_strided_blocks'}, {'test_file': 'test_custom_ops'}, {'test_file': 'test_content_store'}, {'test_file': 'inductor/test_halide'}, {'test_file': 'inductor/test_multi_kernel'}, {'test_file': 'inductor/test_analysis'}, {'test_file': 'inductor/test_pad_mm'}, {'test_file': 'inductor/test_triton_syntax'}, {'test_file': 'inductor/test_triton_extension_backend'}, {'test_file': 'test_sparse_semi_structured'}, {'test_file': 'inductor/test_op_completeness'}, {'test_file': 'inductor/test_subgraph_choice'}, {'test_file': 'inductor/test_b2b_gemm'}, {'test_file': 'inductor/test_triton_heuristics'}, {'test_file': 'inductor/test_cutedsl_grouped_mm'}, {'test_file': 'inductor/test_cpp_wrapper_hipify'}, {'test_file': 'inductor/test_ck_backend'}, {'test_file': 'inductor/test_inductor_utils'}, {'test_file': 'inductor/test_template_heuristics_registry'}, {'test_file': 'inductor/test_async_compile'}, {'test_file': 'inductor/test_gpu_cpp_wrapper'}, {'test_file': 'export/test_export_training_ir_to_run_decomp'}, {'test_file': 'dynamo/test_deque_reconstruct'}, {'test_file': 'inductor/test_utils'}, {'test_file': 'inductor/test_indexing'}, {'test_file': 'inductor/test_inductor_annotations'}, {'test_file': 'inductor/test_compile_worker'}, {'test_file': 'dynamo/test_einops'}, {'test_file': 'inductor/test_external_callables'}, {'test_file': 'test_testing'}, {'test_file': 'dynamo/test_fx_passes_pre_grad'}, {'test_file': 'inductor/test_autoheuristic'}, {'test_file': 'export/test_strict_export_v2'}, {'test_file': 'inductor/test_flex_flash'}, {'test_file': 'inductor/test_segmented_tree'}, {'test_file': 'inductor/test_kernel_optimization'}, {'test_file': 'inductor/test_metrics'}, {'test_file': 'export/test_unflatten_training_ir'}, {'test_file': 'inductor/test_fx_fusion'}, {'test_file': 'inductor/test_xpu_basic'}, {'test_file': 'dynamo/test_inline_and_install'}, {'test_file': 'export/test_functionalized_assertions'}, {'test_file': 'inductor/test_selective_lowering'}, {'test_file': 'dynamo/test_base_output'}, {'test_file': 'inductor/test_lookup_table'}, {'test_file': 'inductor/test_cooperative_reductions'}, {'test_file': 'export/test_serialize'}, {'test_file': 'inductor/test_cutedsl_template'}, {'test_file': 'inductor/test_benchmark_fusion'}, {'test_file': 'inductor/test_inductor_scheduler'}, {'test_file': 'inductor/test_move_constructors_to_gpu'}, {'test_file': 'export/test_export_strict'}, {'test_file': 'dynamo/test_modules'}, {'test_file': 'inductor/test_remote_cache'}, {'test_file': 'inductor/test_coordinate_descent_tuner'}, {'test_file': 'inductor/test_inplace_padding'}, {'test_file': 'inductor/test_cudacodecache'}, {'test_file': 'inductor/test_minifier_utils'}, {'test_file': 'inductor/test_debug_trace'}, {'test_file': 'dynamo/test_recompiles'}, {'test_file': 'inductor/test_foreach'}, {'test_file': 'export/test_tree_utils'}, {'test_file': 'inductor/test_triton_wrapper'}, {'test_file': 'inductor/test_static_cuda_launcher'}, {'test_file': 'export/test_dynamic_shapes'}, {'test_file': 'dynamo/test_sdpa'}, {'test_file': 'dynamo/test_utils'}, {'test_file': 'inductor/test_provenance_tracing'}, {'test_file': 'inductor/test_combo_kernels'}, {'test_file': 'inductor/test_codegen_triton'}, {'test_file': 'dynamo/test_frame_init'}, {'test_file': 'inductor/test_device_assert'}, {'test_file': 'dynamo/test_skip_non_tensor'}, {'test_file': 'dynamo/test_skip_guard_eval_unsafe'}, {'test_file': 'dynamo/test_interop'}, {'test_file': 'functorch/test_eager_transforms'}, {'test_file': 'inductor/test_control_deps'}, {'test_file': 'inductor/test_benchmarking'}, {'test_file': 'inductor/test_helion_kernels'}, {'test_file': 'inductor/test_quantization'}, {'test_file': 'inductor/test_best_config'}, {'test_file': 'export/test_tools'}, {'test_file': 'dynamo/test_buffers_override'}, {'test_file': 'inductor/test_inplacing_pass'}, {'test_file': 'inductor/test_aot_inductor_custom_ops'}, {'test_file': 'inductor/test_split_cat_fx_passes'}, {'test_file': 'inductor/test_profiler'}, {'test_file': 'inductor/test_memory_planning'}, {'test_file': 'inductor/test_mem_estimation'}, {'test_file': 'dynamo/test_view'}, {'test_file': 'inductor/test_cutlass_evt'}, {'test_file': 'dynamo/test_reconstruct'}, {'test_file': 'dynamo/test_aot_autograd'}, {'test_file': 'export/test_cpp_serdes'}, {'test_file': 'inductor/test_cache'}, {'test_file': 'inductor/test_block_analysis'}, {'test_file': 'dynamo/test_subgraphs'}, {'test_file': 'dynamo/test_pre_dispatch'}, {'test_file': 'inductor/test_custom_post_grad_passes'}, {'test_file': 'dynamo/test_fx_annotate'}, {'test_file': 'dynamo/test_pgo'}, {'test_file': 'dynamo/test_config'}, {'test_file': 'dynamo/test_metrics_context'}, {'test_file': 'export/test_package'}, {'test_file': 'export/test_export_opinfo'}, {'test_file': 'dynamo/test_nops'}, {'test_file': 'inductor/test_graph_transform_observer'}, {'test_file': 'inductor/test_aot_inductor_utils'}, {'test_file': 'export/test_db'}, {'test_file': 'dynamo/test_export_mutations'}, {'test_file': 'inductor/test_config'}, {'test_file': 'inductor/test_dependencies'}, {'test_file': 'inductor/test_fuzzer'}, {'test_file': 'dynamo/test_global'}, {'test_file': 'inductor/test_control_flow'}, {'test_file': 'dynamo/test_graph_region_tracker'}, {'test_file': 'dynamo/test_unittest'}, {'test_file': 'inductor/test_compile'}, {'test_file': 'dynamo/test_functions'}, {'test_file': 'inductor/test_ordered_set'}, {'test_file': 'inductor/test_pallas'}, {'test_file': 'dynamo/test_install_free_tensors'}, {'test_file': 'inductor/test_torchinductor_codegen_config_overrides'}, {'test_file': 'export/test_passes'}, {'test_file': 'dynamo/test_autograd_function'}, {'test_file': 'inductor/test_codecache'}, {'test_file': 'dynamo/test_cudagraphs'}, {'test_file': 'inductor/test_alignment'}, {'test_file': 'dynamo/test_profiler'}, {'test_file': 'dynamo/test_guard_serialization'}, {'test_file': 'dynamo/test_compile'}, {'test_file': 'dynamo/test_nested_graph_breaks'}, {'test_file': 'dynamo/test_dicts'}, {'test_file': 'inductor/test_needs_exact_strides'}, {'test_file': 'inductor/test_auto_functionalize'}, {'test_file': 'inductor/test_split_cat_fx_aten_passes'}, {'test_file': 'inductor/test_minifier_isolate'}, {'test_file': 'dynamo/test_list'}, {'test_file': 'dynamo/test_resume'}, {'test_file': 'inductor/test_augmented_graph_helper'}, {'test_file': 'dynamo/test_deviceguard'}, {'test_file': 'dynamo/test_sources'}, {'test_file': 'dynamo/test_backward_higher_order_ops'}, {'test_file': 'dynamo/test_modes'}, {'test_file': 'dynamo/test_optimizers'}, {'test_file': 'export/test_torchbind'}, {'test_file': 'inductor/test_custom_partitioner_fn'}, {'test_file': 'dynamo/test_debug_utils'}, {'test_file': 'dynamo/test_base_hop'}, {'test_file': 'dynamo/test_export'}, {'test_file': 'dynamo/test_package'}, {'test_file': 'inductor/test_efficient_conv_bn_eval'}, {'test_file': 'inductor/test_torchbind'}, {'test_file': 'dynamo/test_python_dispatcher'}, {'test_file': 'export/test_swap'}, {'test_file': 'export/test_unflatten'}, {'test_file': 'dynamo/test_verify_correctness'}, {'test_file': 'inductor/test_fxir_backend'}, {'test_file': 'dynamo/test_cudagraphs_expandable_segments'}, {'test_file': 'inductor/test_caching'}, {'test_file': 'dynamo/test_aot_autograd_cache'}, {'test_file': 'dynamo/test_flat_apply'}, {'test_file': 'dynamo/test_input_attr_tracking'}, {'test_file': 'dynamo/test_graph_deduplication'}, {'test_file': 'inductor/test_distributed_patterns'}, {'test_file': 'dynamo/test_structured_trace'}, {'test_file': 'dynamo/test_error_messages'}, {'test_file': 'dynamo/test_bytecode_utils'}, {'test_file': 'dynamo/test_fake_distributed'}, {'test_file': 'inductor/test_mps_basic'}, {'test_file': 'export/test_nativert'}, {'test_file': 'export/test_hop'}, {'test_file': 'dynamo/test_tree_map'}, {'test_file': 'dynamo/test_minifier'}, {'test_file': 'dynamo/test_guard_manager'}, {'test_file': 'export/test_schema'}, {'test_file': 'dynamo/test_torchrec'}, {'test_file': 'export/test_pass_infra'}, {'test_file': 'export/test_experimental'}, {'test_file': 'export/test_converter'}, {'test_file': 'export/test_export'}, {'test_file': 'test_model_exports_to_core_aten'}, {'test_file': 'dynamo/test_precompile_context'}, {'test_file': 'dynamo/test_trace_rules'}, {'test_file': 'export/test_upgrader'}, {'test_file': 'dynamo/test_hooks'}, {'test_file': 'dynamo/test_reorder_logs'}, {'test_file': 'dynamo/test_subclasses'}, {'test_file': 'dynamo/test_exceptions'}, {'test_file': 'dynamo/test_generator'}, {'test_file': 'export/test_lift_unlift'}, {'test_file': 'export/test_verifier'}, {'test_file': 'export/test_sparse'}, {'test_file': 'dynamo/test_python_autograd'}, {'test_file': 'export/test_draft_export'}, {'test_file': 'dynamo/test_comptime'}, {'test_file': 'test_sort_and_select'}, {'test_file': 'functorch/test_rearrange'}, {'test_file': 'functorch/test_parsing'}, {'test_file': 'test_package'}, {'test_file': 'profiler/test_profiler'}, {'test_file': 'test_mkl_verbose'}, {'test_file': 'test_comparison_utils'}, {'test_file': 'functorch/test_ac_logging'}, {'test_file': 'test_mkldnn_verbose'}, {'test_file': 'test_cpp_api_parity'}, {'test_file': 'test_utils_config_module'}, {'test_file': 'test_hop_infra'}, {'test_file': 'test_appending_byte_serializer'}, {'test_file': 'test_license'}, {'test_file': 'test_ao_sparsity'}, {'test_file': 'test_autoload'}, {'test_file': 'nn/attention/test_open_registry'}, {'test_file': 'xpu/test_fusion'}, {'test_file': 'test_as_strided'}, {'test_file': 'test_foreach'}, {'test_file': 'test_proxy_tensor'}, {'test_file': 'torch_np/test_binary_ufuncs'}, {'test_file': 'torch_np/test_unary_ufuncs'}, {'test_file': 'test_utils_filelock'}, {'test_file': 'test_extension_utils'}, {'test_file': 'test_rename_privateuse1_to_existing_device'}, {'test_file': 'nn/attention/test_fa4'}, {'test_file': 'typing/test_python_operators'}, {'test_file': 'test_functionalization'}, {'test_file': 'torch_np/test_dtype'}, {'test_file': 'test_file_check'}, {'test_file': 'profiler/test_kineto'}, {'test_file': 'test_flop_counter'}, {'test_file': 'backends/xeon/test_launch'}, {'test_file': 'test_show_pickle'}, {'test_file': 'test_openmp'}, {'test_file': 'test_expanded_weights'}, {'test_file': 'test_module_tracker'}, {'test_file': 'torch_np/numpy_tests/core/test_scalarinherit'}, {'test_file': 'test_tensorexpr_pybind'}, {'test_file': 'test_fx_experimental'}, {'test_file': 'functorch/test_ac_knapsack'}, {'test_file': 'torch_np/test_nep50_examples'}, {'test_file': 'test_torch'}, {'test_file': 'xpu/test_gemm'}, {'test_file': 'test_fx_passes'}, {'test_file': 'functorch/test_logging'}, {'test_file': 'test_namedtensor'}, {'test_file': 'test_tensorexpr'}, {'test_file': 'functorch/test_minifier'}, {'test_file': 'higher_order_ops/test_invoke_quant'}, {'test_file': 'torch_np/test_basic'}, {'test_file': 'test_jiterator'}, {'test_file': 'test_native_functions'}, {'test_file': 'test_typing'}, {'test_file': 'higher_order_ops/test_with_effects'}, {'test_file': 'test_weak'}, {'test_file': 'test_complex'}, {'test_file': 'test_optim'}, {'test_file': 'lazy/test_functionalization'}, {'test_file': 'torch_np/test_random'}, {'test_file': 'nn/test_multihead_attention'}, {'test_file': 'test_legacy_vmap'}, {'test_file': 'lazy/test_bindings'}, {'test_file': 'xpu/test_conv'}, {'test_file': 'test_utils'}, {'test_file': 'test_pytree'}, {'test_file': 'test_namedtuple_return_api'}, {'test_file': 'profiler/test_record_function'}, {'test_file': 'test_compile_benchmark_util'}, {'test_file': 'test_set_default_mobile_cpu_allocator'}, {'test_file': 'test_fake_tensor'}, {'test_file': 'test_stateless'}, {'test_file': 'functorch/test_ac'}, {'test_file': 'test_binary_ufuncs'}, {'test_file': 'higher_order_ops/test_print'}, {'test_file': 'test_per_overload_api'}, {'test_file': 'torch_np/numpy_tests/core/test_einsum'}, {'test_file': 'test_multiprocessing'}, {'test_file': 'test_out_dtype_op'}, {'test_file': 'torch_np/test_ufuncs_basic'}, {'test_file': 'lazy/test_step_closures'}, {'test_file': 'functorch/dim/test_getsetitem'}, {'test_file': 'test_numpy_interop'}, {'test_file': 'profiler/test_cpp_thread'}, {'test_file': 'test_segment_reductions'}, {'test_file': 'test_opaque_obj_v2'}, {'test_file': 'test_autograd_fallback'}, {'test_file': 'test_type_hints'}, {'test_file': 'functorch/test_aot_joint_with_descriptors'}, {'test_file': 'test_functionalization_of_rng_ops'}, {'test_file': 'test_fx_reinplace_pass'}, {'test_file': 'functorch/test_control_flow'}, {'test_file': 'test_modules'}, {'test_file': 'nn/test_packed_sequence'}, {'test_file': 'test_numa_binding'}, {'test_file': 'test_pruning_op'}, {'test_file': 'test_jit_fuser_te'}, {'test_file': 'test_autocast'}, {'test_file': 'test_logging'}, {'test_file': 'test_python_dispatch'}, {'test_file': 'nn/test_lazy_modules'}, {'test_file': 'nn/test_pruning'}, {'test_file': 'test_monitor'}, {'test_file': 'test_cuda_sanitizer'}, {'test_file': 'test_bundled_inputs'}, {'test_file': 'torch_np/numpy_tests/core/test_numeric'}, {'test_file': 'torch_np/numpy_tests/core/test_multiarray'}, {'test_file': 'test_itt'}, {'test_file': 'torch_np/numpy_tests/lib/test_function_base'}, {'test_file': 'test_masked'}, {'test_file': 'test_sympy_utils'}, {'test_file': 'test_jit_disabled'}, {'test_file': 'test_subclass'}, {'test_file': 'test_import_stats'}, {'test_file': 'functorch/test_vmap_registrations'}, {'test_file': 'nn/test_parametrization'}, {'test_file': 'test_mkldnn_fusion'}, {'test_file': 'test_cpp_extensions_mtia_backend'}, {'test_file': 'lazy/test_ts_opinfo'}, {'test_file': 'test_dynamic_shapes'}, {'test_file': 'complex_tensor/test_complex_tensor'}, {'test_file': 'optim/test_lrscheduler'}, {'test_file': 'optim/test_swa_utils'}, {'test_file': 'cpp_extensions/python_agnostic_extension/test/test_python_agnostic'}, {'test_file': 'functorch/test_memory_efficient_fusion'}, {'test_file': 'torch_np/numpy_tests/lib/test_histograms'}, {'test_file': 'torch_np/test_indexing'}, {'test_file': 'test_schema_check'}, {'test_file': 'test_tensorboard'}, {'test_file': 'torch_np/numpy_tests/core/test_indexing'}, {'test_file': 'test_futures'}, {'test_file': 'test_tensor_creation_ops'}, {'test_file': 'nn/test_dropout'}, {'test_file': 'functorch/dim/test_split'}, {'test_file': 'torch_np/numpy_tests/lib/test_type_check'}, {'test_file': 'cpp_extensions/test_libtorch_agnostic'}, {'test_file': 'test_cpp_extensions_stream_and_event'}, {'test_file': 'profiler/test_execution_trace'}, {'test_file': 'test_dispatch'}, {'test_file': 'test_datapipe'}, {'test_file': 'test_numba_integration'}, {'test_file': 'test_functional_optim'}, {'test_file': 'test_maskedtensor'}, {'test_file': 'benchmark_utils/test_benchmark_utils'}, {'test_file': 'torch_np/numpy_tests/core/test_scalarmath'}, {'test_file': 'test_scaled_matmul_cuda'}, {'test_file': 'torch_np/numpy_tests/core/test_shape_base'}, {'test_file': 'test_vulkan'}, {'test_file': 'lazy/test_generator'}, {'test_file': 'nn/test_convolution'}, {'test_file': 'torch_np/numpy_tests/linalg/test_linalg'}, {'test_file': 'torch_np/numpy_tests/core/test_dtype'}, {'test_file': 'lazy/test_debug_util'}, {'test_file': 'nn/test_load_state_dict'}, {'test_file': 'test_shape_ops'}, {'test_file': 'nn/test_module_hooks'}, {'test_file': 'torch_np/numpy_tests/lib/test_twodim_base'}, {'test_file': 'profiler/test_memory_profiler'}, {'test_file': 'test_jit_llga_fuser'}, {'test_file': 'test_serialization'}, {'test_file': 'optim/test_optim'}, {'test_file': 'test_indexing'}, {'test_file': 'torch_np/numpy_tests/fft/test_pocketfft'}, {'test_file': 'torch_np/numpy_tests/lib/test_shape_base_'}, {'test_file': 'torch_np/numpy_tests/core/test_getlimits'}, {'test_file': 'torch_np/test_ndarray_methods'}, {'test_file': 'test_view_ops'}, {'test_file': 'test_type_info'}, {'test_file': 'functorch/test_aotdispatch'}, {'test_file': 'test_nn'}, {'test_file': 'torch_np/numpy_tests/core/test_dlpack'}, {'test_file': 'test_multiprocessing_spawn'}, {'test_file': 'test_scatter_gather_ops'}, {'test_file': 'test_cuda_multigpu'}, {'test_file': 'test_mkldnn'}, {'test_file': 'torch_np/numpy_tests/lib/test_index_tricks'}, {'test_file': 'test_jit_autocast'}, {'test_file': 'nn/test_pooling'}, {'test_file': 'nn/test_embedding'}, {'test_file': 'test_unary_ufuncs'}, {'test_file': 'test_xnnpack_integration'}, {'test_file': 'test_cuda_trace'}, {'test_file': 'test_native_mha'}, {'test_file': 'torch_np/numpy_tests/core/test_numerictypes'}, {'test_file': 'test_cuda_nvml_based_avail'}, {'test_file': 'test_function_schema'}, {'test_file': 'test_accelerator'}, {'test_file': 'nn/test_init'}, {'test_file': 'torch_np/numpy_tests/core/test_scalar_methods'}, {'test_file': 'torch_np/numpy_tests/fft/test_helper'}, {'test_file': 'test_mobile_optimizer'}, {'test_file': 'torch_np/test_function_base'}, {'test_file': 'test_type_promotion'}, {'test_file': 'torch_np/test_scalars_0D_arrays'}, {'test_file': 'test_cuda_primary_ctx'}, {'test_file': 'profiler/test_profiler_tree'}, {'test_file': 'torch_np/numpy_tests/lib/test_arraysetops'}, {'test_file': 'test_dlpack'}, {'test_file': 'profiler/test_torch_tidy'}, {'test_file': 'lazy/test_reuse_ir'}, {'test_file': 'test_functional_autograd_benchmark'}, {'test_file': 'test_reductions'}, {'test_file': 'torch_np/test_reductions'}, {'test_file': 'torch_np/numpy_tests/core/test_scalar_ctors'}, {'test_file': 'torch_np/numpy_tests/lib/test_arraypad'}, {'test_file': 'test_prims'}, {'test_file': 'test_spectral_ops'}, {'test_file': 'profiler/test_python_tracer'}, {'test_file': 'cpp_extensions/libtorch_agnostic_2_10_extension/test_version_compatibility'}, {'test_file': 'distributions/test_distributions'}, {'test_file': 'test_autoload_disable'}, {'test_file': 'test_autoload_enable'}, {'test_file': 'test_cpp_extensions_aot_ninja'}, {'test_file': 'test_cpp_extensions_aot_no_ninja'}], 'excluded': []} from test/test-reports/td_exclusions-8a7938a9bff888fe1c29.json is not a benchmark record, skipping 2025-12-04T12:25:31.4212019Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-12-04T12:25:31.4399766Z ##[group]Run cat test/**/*_toprint.log || true 2025-12-04T12:25:31.4400158Z cat test/**/*_toprint.log || true 2025-12-04T12:25:31.4408973Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:31.4409346Z env: 2025-12-04T12:25:31.4409599Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:31.4409868Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:31.4410180Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:31.4410790Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:31.4411275Z DEVICE_NAME: 2025-12-04T12:25:31.4411504Z DEVICE_TYPE: 2025-12-04T12:25:31.4411826Z ##[endgroup] 2025-12-04T12:25:31.4568390Z cat: 'test/**/*_toprint.log': No such file or directory 2025-12-04T12:25:31.4790327Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2025-12-04T12:25:31.4790727Z kill "$MONITOR_SCRIPT_PID" 2025-12-04T12:25:31.4799963Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:31.4800324Z env: 2025-12-04T12:25:31.4800550Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:31.4800824Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:31.4801139Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:31.4801693Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:31.4802176Z DEVICE_NAME: 2025-12-04T12:25:31.4802406Z DEVICE_TYPE: 2025-12-04T12:25:31.4802647Z MONITOR_SCRIPT_PID: 59305 2025-12-04T12:25:31.4802914Z ##[endgroup] 2025-12-04T12:25:31.4834929Z /home/ec2-user/actions-runner/_work/_temp/e26fa84f-2796-454b-a653-47188a8abc36.sh: line 1: kill: (59305) - No such process 2025-12-04T12:25:31.4845912Z ##[error]Process completed with exit code 1. 2025-12-04T12:25:31.5028004Z Prepare all required actions 2025-12-04T12:25:31.5028397Z Getting action download info 2025-12-04T12:25:31.6698860Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-12-04T12:25:32.3126566Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-12-04T12:25:34.3057525Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-12-04T12:25:34.3057867Z with: 2025-12-04T12:25:34.3058233Z file-suffix: test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206 2025-12-04T12:25:34.3058674Z s3-bucket: gha-artifacts 2025-12-04T12:25:34.3058930Z env: 2025-12-04T12:25:34.3059141Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:34.3059410Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:34.3059727Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:34.3060273Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:34.3060790Z DEVICE_NAME: 2025-12-04T12:25:34.3061014Z DEVICE_TYPE: 2025-12-04T12:25:34.3061242Z ##[endgroup] 2025-12-04T12:25:34.3215670Z ##[group]Run # Remove any previous test jsons if they exist 2025-12-04T12:25:34.3216118Z # Remove any previous test jsons if they exist 2025-12-04T12:25:34.3216489Z rm -f test-jsons-*.zip 2025-12-04T12:25:34.3216913Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test/test-reports -i '*.json' 2025-12-04T12:25:34.3226661Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:34.3227026Z env: 2025-12-04T12:25:34.3227256Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:34.3227529Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:34.3227843Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:34.3228405Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:34.3228893Z DEVICE_NAME: 2025-12-04T12:25:34.3229129Z DEVICE_TYPE: 2025-12-04T12:25:34.3229509Z FILE_SUFFIX: test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206 2025-12-04T12:25:34.3229930Z ##[endgroup] 2025-12-04T12:25:34.4057662Z adding: test/test-reports/td_exclusions-8a7938a9bff888fe1c29.json (deflated 82%) 2025-12-04T12:25:34.4069336Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-b20db2ea014300dc.json (deflated 94%) 2025-12-04T12:25:34.4101741Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-199c8a5f5951ffa6.json (deflated 94%) 2025-12-04T12:25:34.4108161Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-97012a12d7277b61.json (deflated 96%) 2025-12-04T12:25:34.4114802Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-4ca7743e44bf290c.json (deflated 96%) 2025-12-04T12:25:34.4149864Z adding: test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-9265b0e9898bf9b1.json (deflated 96%) 2025-12-04T12:25:34.4151050Z adding: test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-de1c2d4fca581e1f.json (deflated 74%) 2025-12-04T12:25:34.4164461Z adding: test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-e43b32b50674a33e.json (deflated 96%) 2025-12-04T12:25:34.4165685Z adding: test/test-reports/python-pytest/inductor.test_deterministic/inductor.test_deterministic-c7eee0aee9f5b9d8.json (deflated 88%) 2025-12-04T12:25:34.4168392Z adding: test/test-reports/python-pytest/dynamo.test_sets/dynamo.test_sets-c3cfb01bc5b67b47.json (deflated 94%) 2025-12-04T12:25:34.4169807Z adding: test/test-reports/python-pytest/dynamo.test_compiler_bisector/dynamo.test_compiler_bisector-8187201dc9315a4d.json (deflated 87%) 2025-12-04T12:25:34.4177080Z adding: test/test-reports/python-pytest/dynamo.test_decorators/dynamo.test_decorators-0e583763ebd3b3fc.json (deflated 91%) 2025-12-04T12:25:34.4566793Z adding: test/test-reports/python-pytest/test_transformers/test_transformers-4fb24eefb9a01bd4.json (deflated 99%) 2025-12-04T12:25:34.4574042Z adding: test/test-reports/python-pytest/test_ops_fwd_gradients/test_ops_fwd_gradients-5dbc324bf3632362.json (deflated 96%) 2025-12-04T12:25:34.4581793Z adding: test/test-reports/python-pytest/test_ops_fwd_gradients/test_ops_fwd_gradients-2cb8345fee0d20d1.json (deflated 95%) 2025-12-04T12:25:34.4598075Z adding: test/test-reports/python-pytest/test_ops_gradients/test_ops_gradients-d3fb84b182f25804.json (deflated 96%) 2025-12-04T12:25:34.4628493Z adding: test/test-reports/python-pytest/test_linalg/test_linalg-ecb4fcce602354d1.json (deflated 97%) 2025-12-04T12:25:34.4674702Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-c5a286cc3fc0ce2c.json (deflated 95%) 2025-12-04T12:25:34.4694091Z adding: test/test-reports/python-pytest/inductor.test_cpu_repro/inductor.test_cpu_repro-cff3315ca24db3df.json (deflated 96%) 2025-12-04T12:25:34.4703230Z adding: test/test-reports/python-pytest/dynamo.test_ctx_manager/dynamo.test_ctx_manager-9d815792ffb0d789.json (deflated 93%) 2025-12-04T12:25:34.4704290Z adding: test/test-reports/python-pytest/inductor.test_cpu_select_algorithm/inductor.test_cpu_select_algorithm-5324e07d08f84675.json (stored 0%) 2025-12-04T12:25:34.4717798Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_strided_blocks/inductor.test_torchinductor_strided_blocks-37bb13dfab68bc3d.json (deflated 97%) 2025-12-04T12:25:34.4722034Z adding: test/test-reports/python-pytest/inductor.test_multi_kernel/inductor.test_multi_kernel-255fcbe2f02e1ae3.json (deflated 88%) 2025-12-04T12:25:34.4726960Z adding: test/test-reports/python-pytest/inductor.test_pad_mm/inductor.test_pad_mm-1824b4b320322926.json (deflated 92%) 2025-12-04T12:25:34.4733123Z adding: test/test-reports/python-pytest/test_sparse_semi_structured/test_sparse_semi_structured-77ff603b5ea79d9b.json (deflated 95%) 2025-12-04T12:25:34.4750723Z adding: test/test-reports/python-pytest/inductor.test_gpu_cpp_wrapper/inductor.test_gpu_cpp_wrapper-b432c653bfe7053d.json (deflated 94%) 2025-12-04T12:25:34.4752015Z adding: test/test-reports/python-pytest/inductor.test_move_constructors_to_gpu/inductor.test_move_constructors_to_gpu-350dc6b168b259f1.json (deflated 85%) 2025-12-04T12:25:34.4753058Z adding: test/test-reports/python-pytest/dynamo.test_list/dynamo.test_list-89d839f900e86b8b.json (deflated 94%) 2025-12-04T12:25:34.4753946Z adding: test/test-reports/python-pytest/dynamo.test_resume/dynamo.test_resume-966072d129aef6be.json (deflated 42%) 2025-12-04T12:25:34.4754997Z adding: test/test-reports/python-pytest/inductor.test_augmented_graph_helper/inductor.test_augmented_graph_helper-9629e19c1d2aae8f.json (deflated 93%) 2025-12-04T12:25:34.4756093Z adding: test/test-reports/python-pytest/dynamo.test_deviceguard/dynamo.test_deviceguard-44dacdbd0b626b15.json (deflated 76%) 2025-12-04T12:25:34.4757140Z adding: test/test-reports/python-pytest/dynamo.test_sources/dynamo.test_sources-c8e9ccd7fdd84d86.json (deflated 69%) 2025-12-04T12:25:34.4758206Z adding: test/test-reports/python-pytest/dynamo.test_backward_higher_order_ops/dynamo.test_backward_higher_order_ops-bea32613ba34e2b9.json (deflated 77%) 2025-12-04T12:25:34.4759292Z adding: test/test-reports/python-pytest/dynamo.test_optimizers/dynamo.test_optimizers-523ecb2595aa9e7c.json (deflated 69%) 2025-12-04T12:25:34.4760366Z adding: test/test-reports/python-pytest/inductor.test_custom_partitioner_fn/inductor.test_custom_partitioner_fn-096ee008790c4efc.json (deflated 50%) 2025-12-04T12:25:34.4761445Z adding: test/test-reports/python-pytest/dynamo.test_debug_utils/dynamo.test_debug_utils-7d0242e9dff3a28b.json (deflated 76%) 2025-12-04T12:25:34.4762391Z adding: test/test-reports/python-pytest/dynamo.test_base_hop/dynamo.test_base_hop-39440f05cd7c387c.json (deflated 79%) 2025-12-04T12:25:34.4769772Z adding: test/test-reports/python-pytest/dynamo.test_export/dynamo.test_export-8a85a11820ca2c3e.json (deflated 90%) 2025-12-04T12:25:34.4770819Z adding: test/test-reports/python-pytest/dynamo.test_python_dispatcher/dynamo.test_python_dispatcher-3db405fd4b8ecc70.json (deflated 84%) 2025-12-04T12:25:34.4771828Z adding: test/test-reports/python-pytest/export.test_swap/export.test_swap-7059f70fd3ff66c0.json (deflated 94%) 2025-12-04T12:25:34.4773350Z adding: test/test-reports/python-pytest/export.test_unflatten/export.test_unflatten-2fe9b333652b1d69.json (deflated 94%) 2025-12-04T12:25:34.4774387Z adding: test/test-reports/python-pytest/dynamo.test_verify_correctness/dynamo.test_verify_correctness-0cd0444895d93ef1.json (deflated 72%) 2025-12-04T12:25:34.4778834Z adding: test/test-reports/python-pytest/inductor.test_fxir_backend/inductor.test_fxir_backend-3637a5683264f627.json (deflated 92%) 2025-12-04T12:25:34.4779849Z adding: test/test-reports/python-pytest/dynamo.test_flat_apply/dynamo.test_flat_apply-5b8baa80f2add320.json (deflated 74%) 2025-12-04T12:25:34.4780884Z adding: test/test-reports/python-pytest/dynamo.test_input_attr_tracking/dynamo.test_input_attr_tracking-e7186c64c589dbc7.json (deflated 83%) 2025-12-04T12:25:34.4782015Z adding: test/test-reports/python-pytest/dynamo.test_graph_deduplication/dynamo.test_graph_deduplication-146b0afa707c650e.json (deflated 90%) 2025-12-04T12:25:34.4783467Z adding: test/test-reports/python-pytest/inductor.test_distributed_patterns/inductor.test_distributed_patterns-a225771515eff859.json (deflated 89%) 2025-12-04T12:25:34.4785666Z adding: test/test-reports/python-pytest/dynamo.test_structured_trace/dynamo.test_structured_trace-36a47fd4af733696.json (deflated 89%) 2025-12-04T12:25:34.4786736Z adding: test/test-reports/python-pytest/dynamo.test_fake_distributed/dynamo.test_fake_distributed-21ff8fa13be61f59.json (deflated 73%) 2025-12-04T12:25:34.4787740Z adding: test/test-reports/python-pytest/export.test_nativert/export.test_nativert-11993ef1b36ec884.json (deflated 88%) 2025-12-04T12:25:34.4789134Z adding: test/test-reports/python-pytest/export.test_hop/export.test_hop-d9a65f9d494897c7.json (deflated 94%) 2025-12-04T12:25:34.4790177Z adding: test/test-reports/python-pytest/test_model_exports_to_core_aten/test_model_exports_to_core_aten-46fb204eaca6b9f2.json (deflated 59%) 2025-12-04T12:25:34.4791280Z adding: test/test-reports/python-pytest/dynamo.test_precompile_context/dynamo.test_precompile_context-7bf3bf27c616b5c5.json (deflated 81%) 2025-12-04T12:25:34.4792317Z adding: test/test-reports/python-pytest/dynamo.test_trace_rules/dynamo.test_trace_rules-ee853126f21c3a7b.json (deflated 78%) 2025-12-04T12:25:34.4793286Z adding: test/test-reports/python-pytest/export.test_upgrader/export.test_upgrader-f8a313da87b3ac32.json (deflated 82%) 2025-12-04T12:25:34.4794210Z adding: test/test-reports/python-pytest/dynamo.test_hooks/dynamo.test_hooks-b07cb839d1e57fc7.json (deflated 88%) 2025-12-04T12:25:34.4797871Z adding: test/test-reports/python-pytest/export.test_sparse/export.test_sparse-4d7ad8fd2ec6dc81.json (deflated 96%) 2025-12-04T12:25:34.4816447Z adding: test/test-reports/python-pytest/test_modules/test_modules-8463597ee5493803.json (deflated 96%) 2025-12-04T12:25:34.4829527Z adding: test/test-reports/python-pytest/complex_tensor.test_complex_tensor/complex_tensor.test_complex_tensor-8b23cb700c6b2092.json (deflated 97%) 2025-12-04T12:25:34.4830955Z adding: test/test-reports/python-pytest/benchmark_utils.test_benchmark_utils/benchmark_utils.test_benchmark_utils-2888f3c320a1b2e3.json (deflated 87%) 2025-12-04T12:25:34.4836261Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalarmath/torch_np.numpy_tests.core.test_scalarmath-4a75920c20f79d0b.json (deflated 97%) 2025-12-04T12:25:34.4847267Z adding: test/test-reports/python-pytest/nn.test_convolution/nn.test_convolution-8ad1e6986509c27f.json (deflated 96%) 2025-12-04T12:25:34.4907192Z ##[group]Run # Remove any previous test reports if they exist 2025-12-04T12:25:34.4907816Z # Remove any previous test reports if they exist 2025-12-04T12:25:34.4908203Z rm -f test-reports-*.zip 2025-12-04T12:25:34.4908691Z zip -r "test-reports-${FILE_SUFFIX}.zip" test/test-reports -i '*.xml' -i '*.csv' 2025-12-04T12:25:34.4917867Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:34.4918225Z env: 2025-12-04T12:25:34.4918440Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:34.4918700Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:34.4919017Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:34.4919566Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:34.4920044Z DEVICE_NAME: 2025-12-04T12:25:34.4920265Z DEVICE_TYPE: 2025-12-04T12:25:34.4920631Z FILE_SUFFIX: test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206 2025-12-04T12:25:34.4921053Z ##[endgroup] 2025-12-04T12:25:34.5032103Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-b20db2ea014300dc.xml (deflated 92%) 2025-12-04T12:25:34.5060983Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-199c8a5f5951ffa6.xml (deflated 93%) 2025-12-04T12:25:34.5064736Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-97012a12d7277b61.xml (deflated 93%) 2025-12-04T12:25:34.5070002Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-4ca7743e44bf290c.xml (deflated 93%) 2025-12-04T12:25:34.5102143Z adding: test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-9265b0e9898bf9b1.xml (deflated 96%) 2025-12-04T12:25:34.5103428Z adding: test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-de1c2d4fca581e1f.xml (deflated 73%) 2025-12-04T12:25:34.5116135Z adding: test/test-reports/python-pytest/inductor.test_compile_subprocess/inductor.test_compile_subprocess-e43b32b50674a33e.xml (deflated 96%) 2025-12-04T12:25:34.5117295Z adding: test/test-reports/python-pytest/inductor.test_deterministic/inductor.test_deterministic-c7eee0aee9f5b9d8.xml (deflated 86%) 2025-12-04T12:25:34.5118877Z adding: test/test-reports/python-pytest/dynamo.test_sets/dynamo.test_sets-c3cfb01bc5b67b47.xml (deflated 88%) 2025-12-04T12:25:34.5120373Z adding: test/test-reports/python-pytest/dynamo.test_compiler_bisector/dynamo.test_compiler_bisector-8187201dc9315a4d.xml (deflated 85%) 2025-12-04T12:25:34.5127715Z adding: test/test-reports/python-pytest/dynamo.test_decorators/dynamo.test_decorators-0e583763ebd3b3fc.xml (deflated 89%) 2025-12-04T12:25:34.5418580Z adding: test/test-reports/python-pytest/test_transformers/test_transformers-4fb24eefb9a01bd4.xml (deflated 99%) 2025-12-04T12:25:34.5424787Z adding: test/test-reports/python-pytest/test_ops_fwd_gradients/test_ops_fwd_gradients-5dbc324bf3632362.xml (deflated 94%) 2025-12-04T12:25:34.5431064Z adding: test/test-reports/python-pytest/test_ops_fwd_gradients/test_ops_fwd_gradients-2cb8345fee0d20d1.xml (deflated 94%) 2025-12-04T12:25:34.5444527Z adding: test/test-reports/python-pytest/test_ops_gradients/test_ops_gradients-d3fb84b182f25804.xml (deflated 95%) 2025-12-04T12:25:34.5467593Z adding: test/test-reports/python-pytest/test_linalg/test_linalg-ecb4fcce602354d1.xml (deflated 96%) 2025-12-04T12:25:34.5504431Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-c5a286cc3fc0ce2c.xml (deflated 93%) 2025-12-04T12:25:34.5521864Z adding: test/test-reports/python-pytest/inductor.test_cpu_repro/inductor.test_cpu_repro-cff3315ca24db3df.xml (deflated 96%) 2025-12-04T12:25:34.5530839Z adding: test/test-reports/python-pytest/dynamo.test_ctx_manager/dynamo.test_ctx_manager-9d815792ffb0d789.xml (deflated 91%) 2025-12-04T12:25:34.5532163Z adding: test/test-reports/python-pytest/inductor.test_cpu_select_algorithm/inductor.test_cpu_select_algorithm-5324e07d08f84675.xml (deflated 28%) 2025-12-04T12:25:34.5543103Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_strided_blocks/inductor.test_torchinductor_strided_blocks-37bb13dfab68bc3d.xml (deflated 96%) 2025-12-04T12:25:34.5547305Z adding: test/test-reports/python-pytest/inductor.test_multi_kernel/inductor.test_multi_kernel-255fcbe2f02e1ae3.xml (deflated 87%) 2025-12-04T12:25:34.5551576Z adding: test/test-reports/python-pytest/inductor.test_pad_mm/inductor.test_pad_mm-1824b4b320322926.xml (deflated 91%) 2025-12-04T12:25:34.5555894Z adding: test/test-reports/python-pytest/test_sparse_semi_structured/test_sparse_semi_structured-77ff603b5ea79d9b.xml (deflated 93%) 2025-12-04T12:25:34.5571402Z adding: test/test-reports/python-pytest/inductor.test_gpu_cpp_wrapper/inductor.test_gpu_cpp_wrapper-b432c653bfe7053d.xml (deflated 93%) 2025-12-04T12:25:34.5572560Z adding: test/test-reports/python-pytest/inductor.test_move_constructors_to_gpu/inductor.test_move_constructors_to_gpu-350dc6b168b259f1.xml (deflated 81%) 2025-12-04T12:25:34.5573604Z adding: test/test-reports/python-pytest/dynamo.test_list/dynamo.test_list-89d839f900e86b8b.xml (deflated 87%) 2025-12-04T12:25:34.5574494Z adding: test/test-reports/python-pytest/dynamo.test_resume/dynamo.test_resume-966072d129aef6be.xml (deflated 42%) 2025-12-04T12:25:34.5575537Z adding: test/test-reports/python-pytest/inductor.test_augmented_graph_helper/inductor.test_augmented_graph_helper-9629e19c1d2aae8f.xml (deflated 86%) 2025-12-04T12:25:34.5576634Z adding: test/test-reports/python-pytest/dynamo.test_deviceguard/dynamo.test_deviceguard-44dacdbd0b626b15.xml (deflated 63%) 2025-12-04T12:25:34.5577602Z adding: test/test-reports/python-pytest/dynamo.test_sources/dynamo.test_sources-c8e9ccd7fdd84d86.xml (deflated 57%) 2025-12-04T12:25:34.5578666Z adding: test/test-reports/python-pytest/dynamo.test_backward_higher_order_ops/dynamo.test_backward_higher_order_ops-bea32613ba34e2b9.xml (deflated 70%) 2025-12-04T12:25:34.5579756Z adding: test/test-reports/python-pytest/dynamo.test_optimizers/dynamo.test_optimizers-523ecb2595aa9e7c.xml (deflated 66%) 2025-12-04T12:25:34.5580935Z adding: test/test-reports/python-pytest/inductor.test_custom_partitioner_fn/inductor.test_custom_partitioner_fn-096ee008790c4efc.xml (deflated 49%) 2025-12-04T12:25:34.5582021Z adding: test/test-reports/python-pytest/dynamo.test_debug_utils/dynamo.test_debug_utils-7d0242e9dff3a28b.xml (deflated 62%) 2025-12-04T12:25:34.5583032Z adding: test/test-reports/python-pytest/dynamo.test_base_hop/dynamo.test_base_hop-39440f05cd7c387c.xml (deflated 74%) 2025-12-04T12:25:34.5589027Z adding: test/test-reports/python-pytest/dynamo.test_export/dynamo.test_export-8a85a11820ca2c3e.xml (deflated 87%) 2025-12-04T12:25:34.5590020Z adding: test/test-reports/python-pytest/dynamo.test_python_dispatcher/dynamo.test_python_dispatcher-3db405fd4b8ecc70.xml (deflated 77%) 2025-12-04T12:25:34.5591009Z adding: test/test-reports/python-pytest/export.test_swap/export.test_swap-7059f70fd3ff66c0.xml (deflated 93%) 2025-12-04T12:25:34.5592665Z adding: test/test-reports/python-pytest/export.test_unflatten/export.test_unflatten-2fe9b333652b1d69.xml (deflated 92%) 2025-12-04T12:25:34.5593705Z adding: test/test-reports/python-pytest/dynamo.test_verify_correctness/dynamo.test_verify_correctness-0cd0444895d93ef1.xml (deflated 64%) 2025-12-04T12:25:34.5597499Z adding: test/test-reports/python-pytest/inductor.test_fxir_backend/inductor.test_fxir_backend-3637a5683264f627.xml (deflated 90%) 2025-12-04T12:25:34.5598496Z adding: test/test-reports/python-pytest/dynamo.test_flat_apply/dynamo.test_flat_apply-5b8baa80f2add320.xml (deflated 63%) 2025-12-04T12:25:34.5599543Z adding: test/test-reports/python-pytest/dynamo.test_input_attr_tracking/dynamo.test_input_attr_tracking-e7186c64c589dbc7.xml (deflated 79%) 2025-12-04T12:25:34.5600654Z adding: test/test-reports/python-pytest/dynamo.test_graph_deduplication/dynamo.test_graph_deduplication-146b0afa707c650e.xml (deflated 87%) 2025-12-04T12:25:34.5602384Z adding: test/test-reports/python-pytest/inductor.test_distributed_patterns/inductor.test_distributed_patterns-a225771515eff859.xml (deflated 86%) 2025-12-04T12:25:34.5604019Z adding: test/test-reports/python-pytest/dynamo.test_structured_trace/dynamo.test_structured_trace-36a47fd4af733696.xml (deflated 86%) 2025-12-04T12:25:34.5605088Z adding: test/test-reports/python-pytest/dynamo.test_fake_distributed/dynamo.test_fake_distributed-21ff8fa13be61f59.xml (deflated 58%) 2025-12-04T12:25:34.5606099Z adding: test/test-reports/python-pytest/export.test_nativert/export.test_nativert-11993ef1b36ec884.xml (deflated 83%) 2025-12-04T12:25:34.5607685Z adding: test/test-reports/python-pytest/export.test_hop/export.test_hop-d9a65f9d494897c7.xml (deflated 93%) 2025-12-04T12:25:34.5608651Z adding: test/test-reports/python-pytest/test_model_exports_to_core_aten/test_model_exports_to_core_aten-46fb204eaca6b9f2.xml (deflated 58%) 2025-12-04T12:25:34.5609743Z adding: test/test-reports/python-pytest/dynamo.test_precompile_context/dynamo.test_precompile_context-7bf3bf27c616b5c5.xml (deflated 76%) 2025-12-04T12:25:34.5610818Z adding: test/test-reports/python-pytest/dynamo.test_trace_rules/dynamo.test_trace_rules-ee853126f21c3a7b.xml (deflated 67%) 2025-12-04T12:25:34.5611818Z adding: test/test-reports/python-pytest/export.test_upgrader/export.test_upgrader-f8a313da87b3ac32.xml (deflated 69%) 2025-12-04T12:25:34.5612745Z adding: test/test-reports/python-pytest/dynamo.test_hooks/dynamo.test_hooks-b07cb839d1e57fc7.xml (deflated 85%) 2025-12-04T12:25:34.5626764Z adding: test/test-reports/python-pytest/export.test_sparse/export.test_sparse-4d7ad8fd2ec6dc81.xml (deflated 93%) 2025-12-04T12:25:34.5629516Z adding: test/test-reports/python-pytest/test_modules/test_modules-8463597ee5493803.xml (deflated 94%) 2025-12-04T12:25:34.5637306Z adding: test/test-reports/python-pytest/complex_tensor.test_complex_tensor/complex_tensor.test_complex_tensor-8b23cb700c6b2092.xml (deflated 96%) 2025-12-04T12:25:34.5638514Z adding: test/test-reports/python-pytest/benchmark_utils.test_benchmark_utils/benchmark_utils.test_benchmark_utils-2888f3c320a1b2e3.xml (deflated 79%) 2025-12-04T12:25:34.5642523Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalarmath/torch_np.numpy_tests.core.test_scalarmath-4a75920c20f79d0b.xml (deflated 96%) 2025-12-04T12:25:34.5651651Z adding: test/test-reports/python-pytest/nn.test_convolution/nn.test_convolution-8ad1e6986509c27f.xml (deflated 96%) 2025-12-04T12:25:34.5705494Z ##[group]Run # Remove any previous usage logs if they exist 2025-12-04T12:25:34.5705956Z # Remove any previous usage logs if they exist 2025-12-04T12:25:34.5706318Z rm -f logs-*.zip 2025-12-04T12:25:34.5706673Z zip "logs-${FILE_SUFFIX}.zip" 'usage_log.txt' || true 2025-12-04T12:25:34.5707180Z zip -r "logs-${FILE_SUFFIX}.zip" test/test-reports -i '*.log' || true 2025-12-04T12:25:34.5716269Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:34.5716744Z env: 2025-12-04T12:25:34.5716986Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:34.5717253Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:34.5717585Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:34.5718145Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:34.5718637Z DEVICE_NAME: 2025-12-04T12:25:34.5718864Z DEVICE_TYPE: 2025-12-04T12:25:34.5719235Z FILE_SUFFIX: test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206 2025-12-04T12:25:34.5719662Z ##[endgroup] 2025-12-04T12:25:34.5793865Z adding: usage_log.txt (deflated 58%) 2025-12-04T12:25:34.5838090Z adding: test/test-reports/inductor.test_aot_inductor_4.5_12641f387b8fb16f_.log (deflated 90%) 2025-12-04T12:25:34.5856731Z adding: test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_3.4_eaaf38f1800870c1_.log (deflated 93%) 2025-12-04T12:25:34.5866333Z adding: test/test-reports/inductor.test_torchinductor_opinfo_6.14_65e2e2b656ebd086_.log (deflated 91%) 2025-12-04T12:25:34.5874736Z adding: test/test-reports/inductor.test_torchinductor_opinfo_14.14_73c797de8513cfc8_.log (deflated 91%) 2025-12-04T12:25:34.5929746Z adding: test/test-reports/inductor.test_compile_subprocess_1.2_f3ccfd39c19f4fed_.log (deflated 95%) 2025-12-04T12:25:34.5930520Z adding: test/test-reports/inductor.test_deterministic_1.3_01f8a4a4771bae7d_.log (deflated 82%) 2025-12-04T12:25:34.5933457Z adding: test/test-reports/dynamo.test_sets_1.1_015994dafeaf086f_.log (deflated 87%) 2025-12-04T12:25:34.5934384Z adding: test/test-reports/dynamo.test_compiler_bisector_1.1_b2d443336d449e1c_.log (deflated 74%) 2025-12-04T12:25:34.5936916Z adding: test/test-reports/dynamo.test_decorators_1.1_6109b86b03700c73_.log (deflated 85%) 2025-12-04T12:25:34.6324242Z adding: test/test-reports/test_transformers_1.1_173f13e8d6354705_.log (deflated 98%) 2025-12-04T12:25:34.6332937Z adding: test/test-reports/test_ops_fwd_gradients_3.12_23fc26b77348a9ee_.log (deflated 92%) 2025-12-04T12:25:34.6340456Z adding: test/test-reports/test_ops_fwd_gradients_11.12_f79076c6c96ab27b_.log (deflated 92%) 2025-12-04T12:25:34.6356134Z adding: test/test-reports/test_ops_gradients_7.10_0772651414b4f002_.log (deflated 92%) 2025-12-04T12:25:34.6387279Z adding: test/test-reports/test_linalg_1.1_598146089d6b6d6f_.log (deflated 94%) 2025-12-04T12:25:34.6430323Z adding: test/test-reports/functorch.test_ops_2.6_77c851249c6366e1_.log (deflated 92%) 2025-12-04T12:25:34.6443535Z adding: test/test-reports/inductor.test_cpu_repro_2.2_6e51b985f7d36046_.log (deflated 94%) 2025-12-04T12:25:34.6446764Z adding: test/test-reports/dynamo.test_ctx_manager_1.1_66ea9ab40c2c7714_.log (deflated 88%) 2025-12-04T12:25:34.6447512Z adding: test/test-reports/inductor.test_cpu_select_algorithm_1.1_5984ab56237313fd_.log (deflated 51%) 2025-12-04T12:25:34.6457167Z adding: test/test-reports/inductor.test_torchinductor_strided_blocks_1.1_b1fb6d4ac5e5d775_.log (deflated 95%) 2025-12-04T12:25:34.6457979Z adding: test/test-reports/inductor.test_multi_kernel_1.1_bec4b3df976b796a_.log (deflated 78%) 2025-12-04T12:25:34.6458704Z adding: test/test-reports/inductor.test_pad_mm_1.1_8ac647038af06d96_.log (deflated 75%) 2025-12-04T12:25:34.6465828Z adding: test/test-reports/test_sparse_semi_structured_1.1_fe0272a596a3ae57_.log (deflated 94%) 2025-12-04T12:25:34.6476269Z adding: test/test-reports/inductor.test_gpu_cpp_wrapper_1.1_a0299d1d2a0dc96c_.log (deflated 93%) 2025-12-04T12:25:34.6477046Z adding: test/test-reports/inductor.test_move_constructors_to_gpu_1.1_7d3bfc88dc865b70_.log (deflated 70%) 2025-12-04T12:25:34.6478008Z adding: test/test-reports/dynamo.test_list_1.1_d6d47e0a20ceaf24_.log (deflated 81%) 2025-12-04T12:25:34.6478659Z adding: test/test-reports/dynamo.test_resume_1.1_4b411187985d50e8_.log (deflated 50%) 2025-12-04T12:25:34.6479722Z adding: test/test-reports/inductor.test_augmented_graph_helper_1.1_d973a35117a850a6_.log (deflated 82%) 2025-12-04T12:25:34.6480489Z adding: test/test-reports/dynamo.test_deviceguard_1.1_a09319d54f6cd115_.log (deflated 61%) 2025-12-04T12:25:34.6481269Z adding: test/test-reports/dynamo.test_sources_1.1_ec074f68baeeb3a5_.log (deflated 56%) 2025-12-04T12:25:34.6482268Z adding: test/test-reports/dynamo.test_backward_higher_order_ops_1.1_65963bb6de763503_.log (deflated 72%) 2025-12-04T12:25:34.6483032Z adding: test/test-reports/dynamo.test_optimizers_1.1_6a53efce4d52876b_.log (deflated 56%) 2025-12-04T12:25:34.6483771Z adding: test/test-reports/inductor.test_custom_partitioner_fn_1.1_76a9f83dca8c1ec8_.log (deflated 55%) 2025-12-04T12:25:34.6484517Z adding: test/test-reports/dynamo.test_debug_utils_1.1_6f3ba3f6dc151c68_.log (deflated 62%) 2025-12-04T12:25:34.6485194Z adding: test/test-reports/dynamo.test_base_hop_1.1_047db26b5c066b2b_.log (deflated 71%) 2025-12-04T12:25:34.6496894Z adding: test/test-reports/dynamo.test_export_1.1_0d51c806ddeb10f0_.log (deflated 91%) 2025-12-04T12:25:34.6497665Z adding: test/test-reports/dynamo.test_python_dispatcher_1.1_22f9013b571f1b8c_.log (deflated 69%) 2025-12-04T12:25:34.6498561Z adding: test/test-reports/export.test_swap_1.1_f1e06ac76a1c3f52_.log (deflated 78%) 2025-12-04T12:25:34.6499667Z adding: test/test-reports/export.test_unflatten_1.1_af789f1c183dcc4f_.log (deflated 79%) 2025-12-04T12:25:34.6500557Z adding: test/test-reports/dynamo.test_verify_correctness_1.1_80f11db651815ea1_.log (deflated 67%) 2025-12-04T12:25:34.6503310Z adding: test/test-reports/inductor.test_fxir_backend_1.1_3198189fbbe435ae_.log (deflated 84%) 2025-12-04T12:25:34.6504105Z adding: test/test-reports/dynamo.test_flat_apply_1.1_9a81e6ad28eceb5e_.log (deflated 62%) 2025-12-04T12:25:34.6504935Z adding: test/test-reports/dynamo.test_input_attr_tracking_1.1_b95b69d0a98c91c6_.log (deflated 78%) 2025-12-04T12:25:34.6505805Z adding: test/test-reports/dynamo.test_graph_deduplication_1.1_06acfa57009690f5_.log (deflated 80%) 2025-12-04T12:25:34.6506719Z adding: test/test-reports/inductor.test_distributed_patterns_1.1_7e5bf6d90d3b740d_.log (deflated 82%) 2025-12-04T12:25:34.6508742Z adding: test/test-reports/dynamo.test_structured_trace_1.1_a4f4f804030b4054_.log (deflated 82%) 2025-12-04T12:25:34.6509483Z adding: test/test-reports/dynamo.test_fake_distributed_1.1_e5311a05636d963a_.log (deflated 60%) 2025-12-04T12:25:34.6510188Z adding: test/test-reports/export.test_nativert_1.1_d6bd2abd6f18e6a8_.log (deflated 67%) 2025-12-04T12:25:34.6518577Z adding: test/test-reports/export.test_hop_1.1_a564c4d3545a0718_.log (deflated 95%) 2025-12-04T12:25:34.6519266Z adding: test/test-reports/test_model_exports_to_core_aten_1.1_6a0e15a24c6821d5_.log (deflated 52%) 2025-12-04T12:25:34.6520003Z adding: test/test-reports/dynamo.test_precompile_context_1.1_f035b5fe64050d47_.log (deflated 60%) 2025-12-04T12:25:34.6520712Z adding: test/test-reports/dynamo.test_trace_rules_1.1_c52604f69a6cd1ab_.log (deflated 65%) 2025-12-04T12:25:34.6521393Z adding: test/test-reports/export.test_upgrader_1.1_cd3c4ec05ae92aaa_.log (deflated 66%) 2025-12-04T12:25:34.6522516Z adding: test/test-reports/dynamo.test_hooks_1.1_38c972f0c8749aa6_.log (deflated 81%) 2025-12-04T12:25:34.6528761Z adding: test/test-reports/export.test_sparse_1.1_c3722ceed82c1d25_.log (deflated 92%) 2025-12-04T12:25:34.6549605Z adding: test/test-reports/test_modules_1.4_bac8f06793ddacc2_.log (deflated 93%) 2025-12-04T12:25:34.6564698Z adding: test/test-reports/complex_tensor.test_complex_tensor_1.1_dd9d0d0b4ce48ad0_.log (deflated 95%) 2025-12-04T12:25:34.6565491Z adding: test/test-reports/benchmark_utils.test_benchmark_utils_1.1_08587b185fc67fa8_.log (deflated 72%) 2025-12-04T12:25:34.6570891Z adding: test/test-reports/torch_np.numpy_tests.core.test_scalarmath_1.1_9879c80a9340a002_.log (deflated 93%) 2025-12-04T12:25:34.6580851Z adding: test/test-reports/nn.test_convolution_2.2_bfa725818997bc8c_.log (deflated 94%) 2025-12-04T12:25:34.6616877Z ##[group]Run # Remove any previous debugging artifacts if they exist 2025-12-04T12:25:34.6617583Z # Remove any previous debugging artifacts if they exist 2025-12-04T12:25:34.6618108Z rm -f debug-*.zip 2025-12-04T12:25:34.6618396Z if [ -d 'test/debug' ]; then 2025-12-04T12:25:34.6618757Z  zip -r "debug-${FILE_SUFFIX}.zip" test/debug 2025-12-04T12:25:34.6619088Z fi 2025-12-04T12:25:34.6628254Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:34.6628623Z env: 2025-12-04T12:25:34.6628832Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:34.6629111Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:34.6629433Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:34.6629983Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:34.6630477Z DEVICE_NAME: 2025-12-04T12:25:34.6630707Z DEVICE_TYPE: 2025-12-04T12:25:34.6631077Z FILE_SUFFIX: test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206 2025-12-04T12:25:34.6631505Z ##[endgroup] 2025-12-04T12:25:34.6723510Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-12-04T12:25:34.6723849Z with: 2025-12-04T12:25:34.6724259Z s3-bucket: gha-artifacts 2025-12-04T12:25:34.6724923Z s3-prefix: pytorch/pytorch/19922826259/1/artifact 2025-12-04T12:25:34.6725328Z retention-days: 14 2025-12-04T12:25:34.6725595Z if-no-files-found: warn 2025-12-04T12:25:34.6725879Z path: test-jsons-*.zip 2025-12-04T12:25:34.6726145Z name: artifact 2025-12-04T12:25:34.6726378Z region: us-east-1 2025-12-04T12:25:34.6726619Z env: 2025-12-04T12:25:34.6726846Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:34.6727114Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:34.6727448Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:34.6728015Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:34.6728500Z DEVICE_NAME: 2025-12-04T12:25:34.6728740Z DEVICE_TYPE: 2025-12-04T12:25:34.6728975Z ##[endgroup] 2025-12-04T12:25:35.2018490Z NOTE: s3-prefix specified, ignoring name parameter 2025-12-04T12:25:35.2018956Z With the provided path, there will be 1 file uploaded 2025-12-04T12:25:35.2019409Z Uploading to s3 prefix: pytorch/pytorch/19922826259/1/artifact 2025-12-04T12:25:35.2093877Z Starting upload of test-jsons-test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206.zip 2025-12-04T12:25:35.3512848Z Finished upload of test-jsons-test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206.zip 2025-12-04T12:25:35.3832803Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-12-04T12:25:35.3833131Z with: 2025-12-04T12:25:35.3833358Z s3-bucket: gha-artifacts 2025-12-04T12:25:35.3833684Z s3-prefix: pytorch/pytorch/19922826259/1/artifact 2025-12-04T12:25:35.3834026Z retention-days: 14 2025-12-04T12:25:35.3834281Z if-no-files-found: error 2025-12-04T12:25:35.3834562Z path: test-reports-*.zip 2025-12-04T12:25:35.3834823Z name: artifact 2025-12-04T12:25:35.3835048Z region: us-east-1 2025-12-04T12:25:35.3835271Z env: 2025-12-04T12:25:35.3835485Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:35.3835750Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:35.3836092Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:35.3836642Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:35.3837233Z DEVICE_NAME: 2025-12-04T12:25:35.3837463Z DEVICE_TYPE: 2025-12-04T12:25:35.3837685Z ##[endgroup] 2025-12-04T12:25:35.9208343Z NOTE: s3-prefix specified, ignoring name parameter 2025-12-04T12:25:35.9208786Z With the provided path, there will be 1 file uploaded 2025-12-04T12:25:35.9209236Z Uploading to s3 prefix: pytorch/pytorch/19922826259/1/artifact 2025-12-04T12:25:35.9282686Z Starting upload of test-reports-test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206.zip 2025-12-04T12:25:36.0861211Z Finished upload of test-reports-test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206.zip 2025-12-04T12:25:36.1164819Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-12-04T12:25:36.1165152Z with: 2025-12-04T12:25:36.1165375Z s3-bucket: gha-artifacts 2025-12-04T12:25:36.1165825Z s3-prefix: pytorch/pytorch/19922826259/1/artifact 2025-12-04T12:25:36.1166163Z retention-days: 14 2025-12-04T12:25:36.1166423Z if-no-files-found: ignore 2025-12-04T12:25:36.1166702Z path: logs-*.zip 2025-12-04T12:25:36.1166928Z name: artifact 2025-12-04T12:25:36.1167157Z region: us-east-1 2025-12-04T12:25:36.1167381Z env: 2025-12-04T12:25:36.1167587Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:36.1167869Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:36.1168186Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:36.1168734Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:36.1169222Z DEVICE_NAME: 2025-12-04T12:25:36.1169447Z DEVICE_TYPE: 2025-12-04T12:25:36.1169673Z ##[endgroup] 2025-12-04T12:25:36.4466052Z NOTE: s3-prefix specified, ignoring name parameter 2025-12-04T12:25:36.4466665Z With the provided path, there will be 1 file uploaded 2025-12-04T12:25:36.4467270Z Uploading to s3 prefix: pytorch/pytorch/19922826259/1/artifact 2025-12-04T12:25:36.4541297Z Starting upload of logs-test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206.zip 2025-12-04T12:25:36.6099860Z Finished upload of logs-test-default-4-8-linux.g5.4xlarge.nvidia.gpu_57118183206.zip 2025-12-04T12:25:36.6834732Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-12-04T12:25:36.6835058Z with: 2025-12-04T12:25:36.6835279Z s3-bucket: gha-artifacts 2025-12-04T12:25:36.6835601Z s3-prefix: pytorch/pytorch/19922826259/1/artifact 2025-12-04T12:25:36.6835935Z retention-days: 14 2025-12-04T12:25:36.6836191Z if-no-files-found: ignore 2025-12-04T12:25:36.6836461Z path: debug-*.zip 2025-12-04T12:25:36.6836687Z name: artifact 2025-12-04T12:25:36.6836917Z region: us-east-1 2025-12-04T12:25:36.6837148Z env: 2025-12-04T12:25:36.6837354Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:36.6837617Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:36.6837937Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:36.6838499Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:36.6838978Z DEVICE_NAME: 2025-12-04T12:25:36.6839199Z DEVICE_TYPE: 2025-12-04T12:25:36.6839431Z ##[endgroup] 2025-12-04T12:25:37.0065778Z No files were found with the provided path: debug-*.zip. No artifacts will be uploaded. 2025-12-04T12:25:37.0390701Z ##[group]Run # shellcheck disable=SC2156 2025-12-04T12:25:37.0391085Z # shellcheck disable=SC2156 2025-12-04T12:25:37.0391645Z find . -iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-12-04T12:25:37.0400986Z shell: /usr/bin/bash -e {0} 2025-12-04T12:25:37.0401260Z env: 2025-12-04T12:25:37.0401477Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:37.0401737Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:37.0402057Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:37.0402608Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:37.0403098Z DEVICE_NAME: 2025-12-04T12:25:37.0403319Z DEVICE_TYPE: 2025-12-04T12:25:37.0403541Z ##[endgroup] 2025-12-04T12:25:37.4758169Z Prepare all required actions 2025-12-04T12:25:37.4758660Z Getting action download info 2025-12-04T12:25:37.6518573Z Download action repository 'actions/setup-python@v6' (SHA:83679a892e2d95755f2dac6acb0bfd1e9ac5d548) 2025-12-04T12:25:39.2014844Z ##[group]Run ./.github/actions/upload-utilization-stats 2025-12-04T12:25:39.2015196Z with: 2025-12-04T12:25:39.2015405Z job_id: 57118183206 2025-12-04T12:25:39.2016066Z job_name: linux-jammy-cuda12.8-py3-gcc11-slow-gradcheck / test (default, 4, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck, mem_leak_check) 2025-12-04T12:25:39.2016763Z workflow_name: periodic 2025-12-04T12:25:39.2017037Z workflow_run_id: 19922826259 2025-12-04T12:25:39.2017316Z workflow_attempt: 1 2025-12-04T12:25:39.2017545Z env: 2025-12-04T12:25:39.2017758Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:39.2018031Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:39.2018471Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:39.2019045Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:39.2019527Z DEVICE_NAME: 2025-12-04T12:25:39.2019758Z DEVICE_TYPE: 2025-12-04T12:25:39.2019981Z ##[endgroup] 2025-12-04T12:25:39.2120359Z ##[group]Run actions/setup-python@v6 2025-12-04T12:25:39.2120653Z with: 2025-12-04T12:25:39.2120873Z python-version: 3.10 2025-12-04T12:25:39.2121129Z check-latest: false 2025-12-04T12:25:39.2121465Z token: *** 2025-12-04T12:25:39.2121704Z update-environment: true 2025-12-04T12:25:39.2121989Z allow-prereleases: false 2025-12-04T12:25:39.2122251Z freethreaded: false 2025-12-04T12:25:39.2122487Z env: 2025-12-04T12:25:39.2122697Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:39.2122956Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:39.2123278Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:39.2123826Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:39.2124625Z DEVICE_NAME: 2025-12-04T12:25:39.2124890Z DEVICE_TYPE: 2025-12-04T12:25:39.2125113Z ##[endgroup] 2025-12-04T12:25:39.4941248Z ##[group]Installed versions 2025-12-04T12:25:39.4952267Z Version 3.10 was not found in the local cache 2025-12-04T12:25:39.5132405Z (node:203147) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead. 2025-12-04T12:25:39.5133503Z (Use `node --trace-deprecation ...` to show where the warning was created) 2025-12-04T12:25:39.8821944Z ##[error]The version '3.10' with architecture 'x64' was not found for this operating system. The list of all available versions can be found here: https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json 2025-12-04T12:25:39.9072803Z ##[group]Run pytorch/test-infra/.github/actions/teardown-linux@main 2025-12-04T12:25:39.9073231Z with: 2025-12-04T12:25:39.9073425Z env: 2025-12-04T12:25:39.9073649Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:39.9073916Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:39.9074224Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:39.9074772Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:39.9075246Z DEVICE_NAME: 2025-12-04T12:25:39.9075463Z DEVICE_TYPE: 2025-12-04T12:25:39.9075676Z ##[endgroup] 2025-12-04T12:25:39.9180545Z ##[group]Run set -eou pipefail 2025-12-04T12:25:39.9180858Z set -eou pipefail 2025-12-04T12:25:39.9181115Z  2025-12-04T12:25:39.9181477Z echo "Holding runner for 2 hours until all ssh sessions have logged out" 2025-12-04T12:25:39.9181929Z for _ in $(seq 1440); do 2025-12-04T12:25:39.9182283Z  # Break if no ssh session exists anymore 2025-12-04T12:25:39.9182624Z  if [ "$(who)" = "" ]; then 2025-12-04T12:25:39.9183000Z  break 2025-12-04T12:25:39.9183231Z  fi 2025-12-04T12:25:39.9183457Z  echo "." 2025-12-04T12:25:39.9183691Z  sleep 5 2025-12-04T12:25:39.9183924Z done 2025-12-04T12:25:39.9192951Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:39.9193304Z env: 2025-12-04T12:25:39.9193515Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:39.9193770Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:39.9194079Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:39.9194624Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:39.9195100Z DEVICE_NAME: 2025-12-04T12:25:39.9195315Z DEVICE_TYPE: 2025-12-04T12:25:39.9195536Z ##[endgroup] 2025-12-04T12:25:39.9230047Z Holding runner for 2 hours until all ssh sessions have logged out 2025-12-04T12:25:39.9727116Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T12:25:39.9727679Z # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T12:25:39.9728213Z # shellcheck disable=SC2046 2025-12-04T12:25:39.9728549Z docker stop $(docker ps -q) || true 2025-12-04T12:25:39.9728890Z # Prune all of the docker images 2025-12-04T12:25:39.9729218Z docker system prune -af 2025-12-04T12:25:39.9738331Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:25:39.9738700Z env: 2025-12-04T12:25:39.9738912Z GIT_DEFAULT_BRANCH: main 2025-12-04T12:25:39.9739178Z HAS_NVIDIA_GPU: true 2025-12-04T12:25:39.9739498Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-12-04T12:25:39.9740040Z DOCKER_CONTAINER_ID: 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:25:39.9740521Z DEVICE_NAME: 2025-12-04T12:25:39.9740742Z DEVICE_TYPE: 2025-12-04T12:25:39.9740963Z ##[endgroup] 2025-12-04T12:25:57.1189142Z 9eab885271a8 2025-12-04T12:26:02.8737416Z Deleted Containers: 2025-12-04T12:26:02.8737961Z 9eab885271a8e3762aa85c09a9db3b048e1f7af73c096f58f387f7cb3140ef5e 2025-12-04T12:26:02.8738403Z 2025-12-04T12:26:17.9713384Z Deleted Images: 2025-12-04T12:26:17.9713910Z untagged: public.ecr.aws/docker/library/python:3.13 2025-12-04T12:26:17.9714884Z untagged: public.ecr.aws/docker/library/python@sha256:3f986299a7b8b44b0d8cf9bda2b22361ce5c3058ef5d7cb17fb7452506680ab0 2025-12-04T12:26:17.9715963Z deleted: sha256:44438aecfedf7b6086fce506dae0db5ba7fc0027f9b743f1a75a6b5cbc7de70a 2025-12-04T12:26:17.9716851Z deleted: sha256:6f09a1f5d8a107c2532fbd116e75116cb75fa77b1a7d72d3bdf1ac12de152acd 2025-12-04T12:26:17.9717741Z deleted: sha256:fe5f3ac0be086125eb1e3cd10cc33e8e426f4e079381f7ce5a987b626e99fa67 2025-12-04T12:26:17.9718605Z deleted: sha256:79dd2061a22cf919cfc4f1f02704bfda09afadb017265e670ee54441d296c06c 2025-12-04T12:26:17.9719866Z deleted: sha256:9447ad402aafdbee17e999b0ec84ad89c2646dbebf054d469d4f8bee77f66212 2025-12-04T12:26:17.9720728Z deleted: sha256:7a4909f3c1975be52292f53107495ee1b41c17494918767ccedf1cf1688ae318 2025-12-04T12:26:17.9721544Z deleted: sha256:3474923d97f1f498237650a7d51bd4aea37d5e6b9d8a778777920584af5dd560 2025-12-04T12:26:17.9722389Z deleted: sha256:683afd1773444401a9cbd24842ee5d9154a11abb4fab63ddea5c03df788597ee 2025-12-04T12:26:17.9723720Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T12:26:17.9725695Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image@sha256:ba21003510dba4bdeed83df81a56fa468e0ee1b612a9445ae1f402a280804f97 2025-12-04T12:26:17.9726879Z deleted: sha256:add7313791033822205cdb3cf32096534b2cfaa4855bd48119b59000bfe00301 2025-12-04T12:26:17.9727745Z deleted: sha256:85a76b7bf29ad34eb76cce6f46af5d49a58b6272f80f983d5c769e82c7749301 2025-12-04T12:26:17.9728620Z deleted: sha256:0882f3ce59ff5ae30195ee4b059fc713e13eda107a3a7814a4616ac9058a30a4 2025-12-04T12:26:17.9729472Z deleted: sha256:64ba5b9344c11a3e4729136076830b90ac4cf1554046edb1bd4f0784b66ebd9b 2025-12-04T12:26:17.9730316Z deleted: sha256:88213c59cf461a65ab9b6cb07b4195dc9d41b5241c152daa002c7b3112e09124 2025-12-04T12:26:17.9731187Z deleted: sha256:4c0f83afa802ffbc05ebaf1aa50e48a2447c7c295549a6dded80ac63437906ca 2025-12-04T12:26:17.9732213Z deleted: sha256:6f7ec74460e8fb070c8209949095ea3be5f4e2fd69c9f750cd39ac4093f5e64b 2025-12-04T12:26:17.9733125Z deleted: sha256:d6928b0d1021b31942fdcb64e5eb4a34682de66e959dd424ed6ed02c29cd706d 2025-12-04T12:26:17.9733970Z deleted: sha256:4e9fbcb1705a6351bb34dd320558752614308636b94fd9ae6f26063e3deadc0a 2025-12-04T12:26:17.9734809Z deleted: sha256:43aabd0201f48712f21758071352dea029b4de37be08b2e2197706856a9ecbf2 2025-12-04T12:26:17.9735648Z deleted: sha256:940a98dec78303f0548beb1033242a45e9097607ef3e55c8b949b69b73d1b95e 2025-12-04T12:26:17.9736486Z deleted: sha256:d2849fa0e0411cf66e4408831d70e38838afb55b11a80c1c4d8aa0ae7dc9ca40 2025-12-04T12:26:17.9737335Z deleted: sha256:14f40d23c20c7e562623f89deb376520296758bc39dd3c77284049b84ebd8a31 2025-12-04T12:26:17.9738193Z deleted: sha256:a8ccba61f90ca097cb391d0f4fbed0d9f821d06b00e28f7332e9e2dcfcbac4ca 2025-12-04T12:26:17.9739187Z deleted: sha256:91b2060d290547d3b517d4a11d994bbe23f4560b5546cb91918ca1828dde6be1 2025-12-04T12:26:17.9740019Z deleted: sha256:b42a184755715dcfead7fad655a127433541d316d9628f5f730ff17ad5f8071c 2025-12-04T12:26:17.9740882Z deleted: sha256:aa5b4f3c9169061dc3c6da0e677e8a86f11ecb0a3f9fb4861ab3d8c04379775c 2025-12-04T12:26:17.9741630Z deleted: sha256:b4dcf450081a48d77fea0a21b8d810a69c03608a595e754fe7d365058d0579b7 2025-12-04T12:26:17.9742247Z deleted: sha256:4f7fe12d3d4f5bf890c7ada4ce16f17a105472aa6509a778f917dcce2f28174b 2025-12-04T12:26:17.9742937Z deleted: sha256:2d1d5a74182594f9a8553df00fdcfc809dba407bcd6700d667f862cbe9d555ce 2025-12-04T12:26:17.9743632Z deleted: sha256:d901e2f5d449aeed16b727bdcc11fc0e0f6c30c8fc5c39ac7eeac8a74d9d176c 2025-12-04T12:26:17.9744533Z deleted: sha256:a04df2603bd12372c6632469a9a81ebc4a8d677452c250672b9692884fa6a452 2025-12-04T12:26:17.9745366Z deleted: sha256:f438a6b52273a552dc3820a55c74c53a62a0eae9f2a7d21b37125add7d71639f 2025-12-04T12:26:17.9746123Z deleted: sha256:d4b09517e9518d709ac98b0ae6f8446ec9ac51688253607b1fca67aa2c87b3f4 2025-12-04T12:26:17.9746739Z deleted: sha256:c1fa38335237f5e7263e39d3d3de98215bcfbbb12b826955c02e149bf68efd13 2025-12-04T12:26:17.9747349Z deleted: sha256:c898d20a30de901fca74d7611663b17ab48e1726a11e031e40548ed16ee81877 2025-12-04T12:26:17.9747965Z deleted: sha256:3baceec7096518fcc10696feba551639d698b3145c2fc09cac927bb60c0fd751 2025-12-04T12:26:17.9748586Z deleted: sha256:5245aaaa3d5c3a19f76b9a6c920bd82d1a0ff5289f87c8c109652089709d9b3b 2025-12-04T12:26:17.9749195Z deleted: sha256:f05cc789b95246938c377f474c41187965b89ceac0250e7d5124bec32153f447 2025-12-04T12:26:17.9749806Z deleted: sha256:07ec4fc008de4e7a2c794ec7094cc72e0d287c04c8b2156163aee0bae147fe2d 2025-12-04T12:26:17.9750634Z deleted: sha256:c6302601ad5fde573c1f8c900250478fca7fdc6907d8fd4fae651b94b4d9264d 2025-12-04T12:26:17.9751261Z deleted: sha256:cc5e955ee1dc54931f02606c5ea87aae14f03b5d764492be611480ab041f2882 2025-12-04T12:26:17.9751874Z deleted: sha256:f21c03518996d98452338f4e80bcfd9b139a1dab155f4830be0d3f623035269f 2025-12-04T12:26:17.9752489Z deleted: sha256:519ca6f1279f7886f25f0005527cfa627deebbc5b7d7cdbfa7ef962bcfc4c26d 2025-12-04T12:26:17.9753099Z deleted: sha256:0ef990495216807d0175b192045be3f617e72331bc373b3434807f41bf69168d 2025-12-04T12:26:17.9753705Z deleted: sha256:7093edf7319e1f0e01654c3224e32c8dede5b948d106e0b9b03cbf0bb1091e33 2025-12-04T12:26:17.9754307Z deleted: sha256:c478161e058e2f4041555c3e880b95ee1ee047938dc58549a3a88135740996ae 2025-12-04T12:26:17.9754919Z deleted: sha256:9bb853b0d938cd7c36a80ce8ee40653f2c0ff92719209b11beb03acc8855ce3e 2025-12-04T12:26:17.9755544Z deleted: sha256:fdf2ace71a78ce6910ef9c4b073c195531da47022443b606bb92dcd6499b6afc 2025-12-04T12:26:17.9756172Z deleted: sha256:576c2b3770d871937d3cfb7014328bcb4bd1aed0c28bc438764b3bfdac4c1ac2 2025-12-04T12:26:17.9756802Z deleted: sha256:878e92b9cb82de09ac14a9d5f3f7bc2411a799b6f54d0d64b78c2bb4d1fdc0fc 2025-12-04T12:26:17.9757422Z deleted: sha256:85c8c3b98b65a6695f988a10cc66c981d73a3ef03eda15b8e14d227b50b56300 2025-12-04T12:26:17.9758058Z deleted: sha256:ce2ab3ba07794f9ee95d6ea7de6dcd3d2aed96561f9a79192dd56ca5bf29313a 2025-12-04T12:26:17.9758674Z deleted: sha256:37a6e12976ca957286977e696e63012ab9821214b0483fe1a48d29dcb280508a 2025-12-04T12:26:17.9759338Z deleted: sha256:cd1d5d3dd7038144ca6fe961c0d4c8e705625ae0c36190ba8b3e9602abedad19 2025-12-04T12:26:17.9759961Z deleted: sha256:0e707276e0be2e0008b86d594fadc0d16444d66c4fb7227c56f144cbb3c2affd 2025-12-04T12:26:17.9760595Z deleted: sha256:22d4aad6a2ada91b341c1225a0f314042b8aeabef7568c5c019709b058bf070b 2025-12-04T12:26:17.9761271Z deleted: sha256:ee4adacf4e0933131d0275eddad406b3c8147e6cf07a292b99f1aff4b5355f33 2025-12-04T12:26:17.9761907Z deleted: sha256:43da0b9e7c0e18403dcb834e53628dc7c970ccb2dbd091878c0d7c0170dbc97f 2025-12-04T12:26:17.9762542Z deleted: sha256:00571684bdcd75beda15eb7d4e79b5458bc914350f9bb4d87fcdc97ad15e0da1 2025-12-04T12:26:17.9763166Z deleted: sha256:41615f09950259f1d75e82ef35b6fc53b18fe71ebff143744cfd51009d04349e 2025-12-04T12:26:17.9763837Z deleted: sha256:75ab34d2eed3c7915467a506ab6dab2711918fbabe94add2fb5c62780221ab0c 2025-12-04T12:26:17.9764472Z deleted: sha256:0a39ef2bebf44c1c3893d1e5fb42dad48b8fac7ca673141267ee967f85455e89 2025-12-04T12:26:17.9765109Z deleted: sha256:9b7d024e48ba1f9824a54597621b1b062cbc4aa41a77d81ca538d6b5c24a612c 2025-12-04T12:26:17.9765716Z deleted: sha256:392257172de6434c271bd93394218a91e9aa86d7c18abc2f2759317b9d5fb6de 2025-12-04T12:26:17.9766311Z deleted: sha256:6c3232860b930866a463a356124fc392c7e5f04895695229257e8c3e8a02711d 2025-12-04T12:26:17.9766913Z deleted: sha256:63dd55b807215e2fa6c715419ac0c5072d02dddc848dbf74bb7e77b906b5eaed 2025-12-04T12:26:17.9767531Z deleted: sha256:07a8738c1b4584db72ed9aa60f5274321eb0ba16263450da3a75df8326ebc25f 2025-12-04T12:26:17.9768134Z deleted: sha256:053fe2965b01281d12040ec1893e0d1aa77362a49ea9a1067402272c69dad9f5 2025-12-04T12:26:17.9768749Z deleted: sha256:7857fb5eb181c4e80262ecab60bdd3c266cf3d1409ceb76c05882609b416a8d3 2025-12-04T12:26:17.9769373Z deleted: sha256:752528477fc99089de3bd2c6da7b30cf34f2e901fe06d8fcfe685b411461e883 2025-12-04T12:26:17.9769997Z deleted: sha256:cce0210e2f4b042601813df03aa294a86b0c668fcfc75f4c63f6fa12b2952e15 2025-12-04T12:26:17.9770621Z deleted: sha256:f2bb405a26705ecd12d21380d26d9355d01db3a2175080fbdb468f2b5a25a76c 2025-12-04T12:26:17.9771263Z deleted: sha256:ad430120d4ffbaf97cd8d6de6ea8eefa4a8f80ec45f0b176c6b26bff0970fd33 2025-12-04T12:26:17.9771900Z deleted: sha256:225a4910baea7cc540ed43eeac75046293800ab0b8e0192b51e991c8cb50bcf3 2025-12-04T12:26:17.9772539Z deleted: sha256:a259945b0c3507f049fbac10fb3d3ffe43d45e83c91b80ae8cd1dafb855ad83c 2025-12-04T12:26:17.9773199Z deleted: sha256:862a98881b1d5adad5c21d01602773b894794097de80964ef8f47bcaadb43255 2025-12-04T12:26:17.9773813Z deleted: sha256:1cf6d3c8b6c2694b79a2d08719594903811c330a36a4c7a8a7153a350b53d292 2025-12-04T12:26:17.9774518Z deleted: sha256:232a1ae8b0fee817ff7838bb5986a2f38377d3b1dbbf5217b576af0f953b0844 2025-12-04T12:26:17.9775155Z deleted: sha256:c72c5705dabd6314423dd7d4fb260a20d5d9886b2ebce60d19e9d78c4a2335c2 2025-12-04T12:26:17.9775850Z deleted: sha256:296734cf81fd92c913884d058908598424ffe072676e38de289bbab83768c7bd 2025-12-04T12:26:17.9776451Z deleted: sha256:7c76040481b889847a1804021aeff07547eaa4ee706d6137db218d497a8fd9c1 2025-12-04T12:26:17.9777081Z deleted: sha256:d5e293f5b354e8cbcc6de893ea72cc632b02d8fdfbb08ec3127c4e9662f3ebff 2025-12-04T12:26:17.9777702Z deleted: sha256:f35a64e429c88e249645090f21fbe7dae108d98e0ab4ea13184f24b3fd66c315 2025-12-04T12:26:17.9778325Z deleted: sha256:ce6ae8d595c8e69115c51b1ce4f9a9158795d7b863b1cb53f21c39a87974d41b 2025-12-04T12:26:17.9778953Z deleted: sha256:8941abaee59400fb9b3a60765fea4a1fc2a6a447467a6d983e84c7f72494a450 2025-12-04T12:26:17.9779580Z deleted: sha256:ef53c29a9a2c2bc80ffdb9bfaf92842436b5755ec1ce828b9d11e5e27d656ea1 2025-12-04T12:26:17.9780224Z deleted: sha256:7a347fb0acb43f1c814f8c8ff21185e8b5cf64d7bc5988cea060f77d906e08b5 2025-12-04T12:26:17.9780856Z deleted: sha256:cc855dc9be79496e15175569dced2d13477e50b077a5fd3945f9bf50018880c1 2025-12-04T12:26:17.9781480Z deleted: sha256:f7a9946ada3d4786658bc0b643808bb32a9a45e4e90e30dc43ee19e2dbe24024 2025-12-04T12:26:17.9782093Z deleted: sha256:c22a9215f62812c1d2e32827f5221ff556c5b6702aadbdab6b87b8293f19635e 2025-12-04T12:26:17.9782883Z deleted: sha256:959a56746620012e37c1def1a83c5afb1e7c0adc59b021a28beb53c24df98032 2025-12-04T12:26:17.9783529Z deleted: sha256:31a0fff0695bf6100c17954be72eab2095b466d559c75c3faf2a17d8c41e6ebe 2025-12-04T12:26:17.9784139Z deleted: sha256:c15e2b5241b9e55af1b2593e544391b4b44d0505e6528e8f12425136e93b424c 2025-12-04T12:26:17.9784751Z deleted: sha256:73974f74b436f39a2fdb6461b1e3f7c3e41c73325776fa71d16b942a5b4a365b 2025-12-04T12:26:17.9785117Z 2025-12-04T12:26:17.9785236Z Total reclaimed space: 38GB 2025-12-04T12:26:17.9871954Z Post job cleanup. 2025-12-04T12:26:17.9908410Z Post job cleanup. 2025-12-04T12:26:18.1236026Z (node:203248) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead. 2025-12-04T12:26:18.1237118Z (Use `node --trace-deprecation ...` to show where the warning was created) 2025-12-04T12:26:18.1437784Z Post job cleanup. 2025-12-04T12:26:18.1482480Z Post job cleanup. 2025-12-04T12:26:18.2512320Z [command]/usr/bin/git version 2025-12-04T12:26:18.2577379Z git version 2.50.1 2025-12-04T12:26:18.2615865Z Copying '/home/ec2-user/.gitconfig' to '/home/ec2-user/actions-runner/_work/_temp/5249b407-cd7d-4bc4-afb5-845bc2626980/.gitconfig' 2025-12-04T12:26:18.2626359Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/5249b407-cd7d-4bc4-afb5-845bc2626980' before making global git config changes 2025-12-04T12:26:18.2627279Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T12:26:18.2632029Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-12-04T12:26:18.2682472Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T12:26:18.2732987Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T12:26:18.3145626Z Entering 'android/libs/fbjni' 2025-12-04T12:26:18.3227238Z Entering 'third_party/FP16' 2025-12-04T12:26:18.3312177Z Entering 'third_party/FXdiv' 2025-12-04T12:26:18.3398055Z Entering 'third_party/NNPACK' 2025-12-04T12:26:18.3474833Z Entering 'third_party/NVTX' 2025-12-04T12:26:18.3556517Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:26:18.3636872Z Entering 'third_party/XNNPACK' 2025-12-04T12:26:18.3730951Z Entering 'third_party/aiter' 2025-12-04T12:26:18.3816045Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:26:18.3903884Z Entering 'third_party/benchmark' 2025-12-04T12:26:18.3982394Z Entering 'third_party/composable_kernel' 2025-12-04T12:26:18.4079426Z Entering 'third_party/cpp-httplib' 2025-12-04T12:26:18.4158021Z Entering 'third_party/cpuinfo' 2025-12-04T12:26:18.4243995Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:26:18.4325046Z Entering 'third_party/cutlass' 2025-12-04T12:26:18.4414114Z Entering 'third_party/fbgemm' 2025-12-04T12:26:18.4498410Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:26:18.4577558Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:26:18.4665713Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:26:18.5080958Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:26:18.5167678Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:26:18.5248567Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:26:18.5325426Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:26:18.5410837Z Entering 'third_party/flash-attention' 2025-12-04T12:26:18.5491484Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:26:18.5576342Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:26:18.5664688Z Entering 'third_party/flatbuffers' 2025-12-04T12:26:18.5749077Z Entering 'third_party/fmt' 2025-12-04T12:26:18.5829076Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:26:18.5920737Z Entering 'third_party/gloo' 2025-12-04T12:26:18.5991821Z Entering 'third_party/googletest' 2025-12-04T12:26:18.6072339Z Entering 'third_party/ideep' 2025-12-04T12:26:18.6153477Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:26:18.6240898Z Entering 'third_party/ittapi' 2025-12-04T12:26:18.6320175Z Entering 'third_party/kineto' 2025-12-04T12:26:18.6400949Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:26:18.6478975Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:26:18.6558896Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:26:18.6640529Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:26:18.6719228Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:26:18.6797468Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:26:18.6879861Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:26:18.6957690Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:26:18.7039410Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:26:18.7119363Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:26:18.7198669Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:26:18.7275043Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:26:18.7356231Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:26:18.7444455Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:26:18.7519773Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:26:18.7604905Z Entering 'third_party/kleidiai' 2025-12-04T12:26:18.7687033Z Entering 'third_party/mimalloc' 2025-12-04T12:26:18.7769305Z Entering 'third_party/nlohmann' 2025-12-04T12:26:18.7851250Z Entering 'third_party/onnx' 2025-12-04T12:26:18.7950485Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:26:18.8036649Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:26:18.8117747Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:26:18.8194882Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:26:18.8272766Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:26:18.8354254Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:26:18.8431324Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:26:18.8511764Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:26:18.8587923Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:26:18.8667543Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:26:18.8747076Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:26:18.8828032Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:26:18.8937132Z Entering 'third_party/pocketfft' 2025-12-04T12:26:18.9022312Z Entering 'third_party/protobuf' 2025-12-04T12:26:18.9104557Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:26:18.9180943Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:26:18.9264596Z Entering 'third_party/psimd' 2025-12-04T12:26:18.9343992Z Entering 'third_party/pthreadpool' 2025-12-04T12:26:18.9425434Z Entering 'third_party/pybind11' 2025-12-04T12:26:18.9504400Z Entering 'third_party/python-peachpy' 2025-12-04T12:26:18.9583454Z Entering 'third_party/sleef' 2025-12-04T12:26:18.9663213Z Entering 'third_party/tensorpipe' 2025-12-04T12:26:18.9742296Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:26:18.9819362Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:26:18.9897148Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:26:18.9976913Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:26:19.0056717Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:26:19.0170308Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T12:26:19.0199242Z http.https://github.com/.extraheader 2025-12-04T12:26:19.0211803Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T12:26:19.0252147Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T12:26:19.0657460Z Entering 'android/libs/fbjni' 2025-12-04T12:26:19.0712002Z http.https://github.com/.extraheader 2025-12-04T12:26:19.0761563Z Entering 'third_party/FP16' 2025-12-04T12:26:19.0815527Z http.https://github.com/.extraheader 2025-12-04T12:26:19.0866419Z Entering 'third_party/FXdiv' 2025-12-04T12:26:19.0920463Z http.https://github.com/.extraheader 2025-12-04T12:26:19.0972819Z Entering 'third_party/NNPACK' 2025-12-04T12:26:19.1026145Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1077903Z Entering 'third_party/NVTX' 2025-12-04T12:26:19.1132113Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1186032Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:26:19.1237726Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1288773Z Entering 'third_party/XNNPACK' 2025-12-04T12:26:19.1341029Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1405937Z Entering 'third_party/aiter' 2025-12-04T12:26:19.1461642Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1511837Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:26:19.1561981Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1622718Z Entering 'third_party/benchmark' 2025-12-04T12:26:19.1675838Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1726256Z Entering 'third_party/composable_kernel' 2025-12-04T12:26:19.1780554Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1842291Z Entering 'third_party/cpp-httplib' 2025-12-04T12:26:19.1893939Z http.https://github.com/.extraheader 2025-12-04T12:26:19.1946649Z Entering 'third_party/cpuinfo' 2025-12-04T12:26:19.1998921Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2049339Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:26:19.2102004Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2154303Z Entering 'third_party/cutlass' 2025-12-04T12:26:19.2206910Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2270252Z Entering 'third_party/fbgemm' 2025-12-04T12:26:19.2321452Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2373250Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:26:19.2422500Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2472263Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:26:19.2527873Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2588325Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:26:19.2638293Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2687230Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:26:19.2740954Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2800124Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:26:19.2850848Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2899653Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:26:19.2949969Z http.https://github.com/.extraheader 2025-12-04T12:26:19.2999195Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:26:19.3050539Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3106675Z Entering 'third_party/flash-attention' 2025-12-04T12:26:19.3158252Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3208059Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:26:19.3261406Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3318652Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:26:19.3369391Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3431714Z Entering 'third_party/flatbuffers' 2025-12-04T12:26:19.3483410Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3537551Z Entering 'third_party/fmt' 2025-12-04T12:26:19.3589566Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3639253Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:26:19.3689898Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3739327Z Entering 'third_party/gloo' 2025-12-04T12:26:19.3792022Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3843829Z Entering 'third_party/googletest' 2025-12-04T12:26:19.3895158Z http.https://github.com/.extraheader 2025-12-04T12:26:19.3946254Z Entering 'third_party/ideep' 2025-12-04T12:26:19.3996867Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4047665Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:26:19.4097023Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4155321Z Entering 'third_party/ittapi' 2025-12-04T12:26:19.4205757Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4255588Z Entering 'third_party/kineto' 2025-12-04T12:26:19.4308941Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4357171Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:26:19.4407714Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4456474Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:26:19.4506288Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4561084Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:26:19.4611297Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4661312Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:26:19.4711835Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4761795Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:26:19.4812955Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4859753Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:26:19.4910401Z http.https://github.com/.extraheader 2025-12-04T12:26:19.4965723Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:26:19.5016799Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5072298Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:26:19.5122394Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5173980Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:26:19.5229227Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5281242Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:26:19.5331666Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5380945Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:26:19.5431779Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5478801Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:26:19.5531006Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5584560Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:26:19.5635035Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5693926Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:26:19.5745252Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5794597Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:26:19.5845842Z http.https://github.com/.extraheader 2025-12-04T12:26:19.5900174Z Entering 'third_party/kleidiai' 2025-12-04T12:26:19.5951843Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6001358Z Entering 'third_party/mimalloc' 2025-12-04T12:26:19.6052518Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6100600Z Entering 'third_party/nlohmann' 2025-12-04T12:26:19.6151712Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6204755Z Entering 'third_party/onnx' 2025-12-04T12:26:19.6255479Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6325213Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:26:19.6376499Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6432664Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:26:19.6484094Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6537024Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:26:19.6590107Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6638532Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:26:19.6687740Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6736639Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:26:19.6786518Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6837327Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:26:19.6886173Z http.https://github.com/.extraheader 2025-12-04T12:26:19.6936043Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:26:19.6985827Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7036599Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:26:19.7089413Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7138671Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:26:19.7188606Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7235456Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:26:19.7288928Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7340823Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:26:19.7391620Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7445802Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:26:19.7496078Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7572823Z Entering 'third_party/pocketfft' 2025-12-04T12:26:19.7628948Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7679843Z Entering 'third_party/protobuf' 2025-12-04T12:26:19.7731790Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7782710Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:26:19.7833016Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7881485Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:26:19.7932684Z http.https://github.com/.extraheader 2025-12-04T12:26:19.7985791Z Entering 'third_party/psimd' 2025-12-04T12:26:19.8036930Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8084844Z Entering 'third_party/pthreadpool' 2025-12-04T12:26:19.8136380Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8185623Z Entering 'third_party/pybind11' 2025-12-04T12:26:19.8236872Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8285870Z Entering 'third_party/python-peachpy' 2025-12-04T12:26:19.8337210Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8386746Z Entering 'third_party/sleef' 2025-12-04T12:26:19.8440322Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8488940Z Entering 'third_party/tensorpipe' 2025-12-04T12:26:19.8540169Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8588134Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:26:19.8638598Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8687050Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:26:19.8736792Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8785237Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:26:19.8837746Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8887163Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:26:19.8939643Z http.https://github.com/.extraheader 2025-12-04T12:26:19.8986528Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:26:19.9036606Z http.https://github.com/.extraheader 2025-12-04T12:26:19.9121503Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:19.9160937Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T12:26:19.9574988Z Entering 'android/libs/fbjni' 2025-12-04T12:26:19.9610364Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T12:26:19.9636950Z Entering 'third_party/FP16' 2025-12-04T12:26:19.9672342Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T12:26:19.9696677Z Entering 'third_party/FXdiv' 2025-12-04T12:26:19.9734074Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T12:26:19.9758163Z Entering 'third_party/NNPACK' 2025-12-04T12:26:19.9792902Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T12:26:19.9822231Z Entering 'third_party/NVTX' 2025-12-04T12:26:19.9858533Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T12:26:19.9883108Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T12:26:19.9916361Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T12:26:19.9942333Z Entering 'third_party/XNNPACK' 2025-12-04T12:26:19.9977702Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T12:26:20.0019087Z Entering 'third_party/aiter' 2025-12-04T12:26:20.0055889Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T12:26:20.0080450Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T12:26:20.0115102Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T12:26:20.0150990Z Entering 'third_party/benchmark' 2025-12-04T12:26:20.0186669Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:26:20.0209383Z Entering 'third_party/composable_kernel' 2025-12-04T12:26:20.0243975Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T12:26:20.0278591Z Entering 'third_party/cpp-httplib' 2025-12-04T12:26:20.0312918Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T12:26:20.0338574Z Entering 'third_party/cpuinfo' 2025-12-04T12:26:20.0373300Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T12:26:20.0400050Z Entering 'third_party/cudnn_frontend' 2025-12-04T12:26:20.0434747Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T12:26:20.0460361Z Entering 'third_party/cutlass' 2025-12-04T12:26:20.0497139Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T12:26:20.0531516Z Entering 'third_party/fbgemm' 2025-12-04T12:26:20.0566487Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T12:26:20.0594525Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T12:26:20.0628838Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T12:26:20.0652321Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T12:26:20.0686347Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T12:26:20.0717940Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T12:26:20.0751343Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T12:26:20.0775734Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T12:26:20.0807549Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T12:26:20.0841422Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T12:26:20.0875677Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T12:26:20.0900107Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T12:26:20.0933212Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T12:26:20.0956601Z Entering 'third_party/fbgemm/external/json' 2025-12-04T12:26:20.0990767Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T12:26:20.1021046Z Entering 'third_party/flash-attention' 2025-12-04T12:26:20.1059278Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T12:26:20.1083805Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T12:26:20.1117862Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T12:26:20.1150265Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T12:26:20.1183854Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T12:26:20.1218450Z Entering 'third_party/flatbuffers' 2025-12-04T12:26:20.1254288Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T12:26:20.1282522Z Entering 'third_party/fmt' 2025-12-04T12:26:20.1316738Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T12:26:20.1343801Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T12:26:20.1379005Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T12:26:20.1402952Z Entering 'third_party/gloo' 2025-12-04T12:26:20.1436824Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T12:26:20.1462603Z Entering 'third_party/googletest' 2025-12-04T12:26:20.1496734Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:26:20.1522060Z Entering 'third_party/ideep' 2025-12-04T12:26:20.1558164Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T12:26:20.1580456Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T12:26:20.1615662Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T12:26:20.1651424Z Entering 'third_party/ittapi' 2025-12-04T12:26:20.1685816Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T12:26:20.1712573Z Entering 'third_party/kineto' 2025-12-04T12:26:20.1746881Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T12:26:20.1770901Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T12:26:20.1805101Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T12:26:20.1826233Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T12:26:20.1860367Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T12:26:20.1885713Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T12:26:20.1920337Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T12:26:20.1945737Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T12:26:20.1978841Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T12:26:20.2003789Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T12:26:20.2039173Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T12:26:20.2061707Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T12:26:20.2095929Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T12:26:20.2122852Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T12:26:20.2166455Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T12:26:20.2192264Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T12:26:20.2226471Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:26:20.2250477Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T12:26:20.2283133Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T12:26:20.2309616Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T12:26:20.2347021Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T12:26:20.2372734Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T12:26:20.2412567Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T12:26:20.2436079Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:26:20.2470311Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T12:26:20.2496535Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:26:20.2530803Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T12:26:20.2562674Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T12:26:20.2594953Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T12:26:20.2617855Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T12:26:20.2650366Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T12:26:20.2678735Z Entering 'third_party/kleidiai' 2025-12-04T12:26:20.2713431Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T12:26:20.2738442Z Entering 'third_party/mimalloc' 2025-12-04T12:26:20.2771286Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T12:26:20.2801162Z Entering 'third_party/nlohmann' 2025-12-04T12:26:20.2835475Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T12:26:20.2862148Z Entering 'third_party/onnx' 2025-12-04T12:26:20.2896410Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T12:26:20.2939845Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T12:26:20.2972949Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:26:20.3003394Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T12:26:20.3040754Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T12:26:20.3066318Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T12:26:20.3098518Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:26:20.3121471Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T12:26:20.3153830Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:26:20.3177210Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T12:26:20.3211018Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T12:26:20.3234614Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T12:26:20.3267221Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T12:26:20.3291478Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T12:26:20.3324491Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T12:26:20.3348748Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T12:26:20.3381183Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T12:26:20.3404256Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T12:26:20.3438761Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T12:26:20.3459599Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T12:26:20.3492201Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T12:26:20.3520429Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T12:26:20.3554195Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T12:26:20.3582139Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T12:26:20.3614612Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T12:26:20.3665134Z Entering 'third_party/pocketfft' 2025-12-04T12:26:20.3703501Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T12:26:20.3727646Z Entering 'third_party/protobuf' 2025-12-04T12:26:20.3760942Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T12:26:20.3788539Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T12:26:20.3822368Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T12:26:20.3846435Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T12:26:20.3878732Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:26:20.3907635Z Entering 'third_party/psimd' 2025-12-04T12:26:20.3945106Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T12:26:20.3970469Z Entering 'third_party/pthreadpool' 2025-12-04T12:26:20.4005547Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T12:26:20.4029626Z Entering 'third_party/pybind11' 2025-12-04T12:26:20.4062256Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:26:20.4091287Z Entering 'third_party/python-peachpy' 2025-12-04T12:26:20.4125987Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T12:26:20.4151228Z Entering 'third_party/sleef' 2025-12-04T12:26:20.4185625Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T12:26:20.4211327Z Entering 'third_party/tensorpipe' 2025-12-04T12:26:20.4244726Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T12:26:20.4269405Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T12:26:20.4302183Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T12:26:20.4331706Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T12:26:20.4365616Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T12:26:20.4389795Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T12:26:20.4422297Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T12:26:20.4447288Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T12:26:20.4480225Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T12:26:20.4502156Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T12:26:20.4541302Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T12:26:20.4597835Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4634233Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4666955Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4700364Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4734063Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4767440Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4801914Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4834295Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4867778Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4900395Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4934529Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.4967295Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5001530Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5034451Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5068113Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5100040Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5134322Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5166883Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5199791Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5233245Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5266643Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5300504Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5340522Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5372620Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5405782Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5438646Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5470897Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5502531Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5538924Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5571149Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5603187Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5635574Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5668386Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5703811Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5737874Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5772329Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5804520Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5841532Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5874435Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5907166Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5940538Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.5972673Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6006613Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6038649Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6072052Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6107844Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6150847Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6183898Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6216885Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6249896Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6282417Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6314608Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6349444Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6380248Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6413303Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6445735Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6478501Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6510915Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6543108Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6576110Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6610295Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6644429Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6677104Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6707686Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6738218Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6767768Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6800091Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6835151Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6870117Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6902170Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6938721Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.6971520Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7004357Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7037781Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7070784Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7103395Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7137866Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7169522Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7202722Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7236685Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7270589Z [command]/usr/bin/git config --file /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T12:26:20.7411514Z A job completed hook has been configured by the self-hosted runner administrator 2025-12-04T12:26:20.7431651Z ##[group]Run '/home/ec2-user/runner-scripts/after_job.sh' 2025-12-04T12:26:20.7440064Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T12:26:20.7440437Z ##[endgroup] 2025-12-04T12:26:20.7558942Z [!ALERT!] Swap in detected! [!ALERT!] 2025-12-04T12:26:39.8789072Z Cleaning up orphan processes