2025-09-07T07:45:24.7977820Z Current runner version: '2.328.0'
2025-09-07T07:45:24.7983329Z Runner name: 'i-0d73070610f53945f-1004'
2025-09-07T07:45:24.7984180Z Runner group name: 'default'
2025-09-07T07:45:24.7985207Z Machine name: '92d046649eb1'
2025-09-07T07:45:24.7987766Z ##[group]GITHUB_TOKEN Permissions
2025-09-07T07:45:24.7989690Z Contents: read
2025-09-07T07:45:24.7990359Z Metadata: read
2025-09-07T07:45:24.7990839Z ##[endgroup]
2025-09-07T07:45:24.7992938Z Secret source: Actions
2025-09-07T07:45:24.7993764Z Prepare workflow directory
2025-09-07T07:45:24.8563905Z Prepare all required actions
2025-09-07T07:45:24.8598888Z Getting action download info
2025-09-07T07:45:25.1507400Z Download action repository 'pytorch/test-infra@main' (SHA:548a4bc624d43a01cdf165a63b041f0ae014ddbd)
2025-09-07T07:46:29.5986881Z Download action repository 'pytorch/pytorch@main' (SHA:ada43ed39c80b746b4822c92640a1882619e2795)
2025-09-07T07:50:37.6712015Z Download action repository 'actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065' (SHA:a26af69be951a213d495a4c3e4e4022e16d87065)
2025-09-07T07:50:39.4028360Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722)
2025-09-07T07:50:40.2895658Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076)
2025-09-07T07:50:40.8724544Z Download action repository 'seemethere/upload-artifact-s3@baba72d0712b404f646cebe0730933554ebce96a' (SHA:baba72d0712b404f646cebe0730933554ebce96a)
2025-09-07T07:50:41.7979678Z Getting action download info
2025-09-07T07:50:41.9443889Z Download action repository 'actions/checkout@v4' (SHA:08eba0b27e820071cde6df949e0beb9ba4906955)
2025-09-07T07:50:42.9632031Z Getting action download info
2025-09-07T07:50:43.0709205Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2025-09-07T07:50:43.6709151Z Getting action download info 2025-09-07T07:50:43.7797920Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2025-09-07T07:50:44.2791891Z Getting action download info 2025-09-07T07:50:44.4969054Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/heads/main (93fb23d6fae7c4e82c4239a1033e522088742634) 2025-09-07T07:50:44.4972602Z ##[group] Inputs 2025-09-07T07:50:44.4972926Z build-environment: linux-jammy-cuda12.8-py3.10-gcc9-sm90 2025-09-07T07:50:44.4978165Z test-matrix: {"include": [{"config": "inductor_huggingface_perf_cuda_h100", "shard": 1, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 2, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 3, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 4, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 5, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 1, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 2, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 3, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 4, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 5, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 6, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 7, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 1, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": 
"inductor_torchbench_perf_cuda_h100", "shard": 2, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 3, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 4, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 5, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 6, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 7, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 8, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 9, "num_shards": 9, "runner": "linux.aws.h100"}]} 2025-09-07T07:50:44.4983704Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:50:44.4984393Z sync-tag: 2025-09-07T07:50:44.4985281Z timeout-minutes: 1440 2025-09-07T07:50:44.4985507Z use-gha: 2025-09-07T07:50:44.4986366Z dashboard-tag: training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true 2025-09-07T07:50:44.4987300Z s3-bucket: gha-artifacts 2025-09-07T07:50:44.4987509Z aws-role-to-assume: 2025-09-07T07:50:44.4988028Z disable-monitor: false 2025-09-07T07:50:44.4988277Z monitor-log-interval: 15 2025-09-07T07:50:44.4988506Z monitor-data-collect-interval: 4 2025-09-07T07:50:44.4988752Z ##[endgroup] 2025-09-07T07:50:44.4989061Z Complete job name: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T07:50:44.5845573Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2025-09-07T07:50:44.5846236Z with: 
2025-09-07T07:50:44.5846711Z github-secret: ***
2025-09-07T07:50:44.5847251Z instructions: All testing is done inside the container, to start an interactive session run: docker exec -it $(docker container ps --format '{{.ID}}') bash
2025-09-07T07:50:44.5847811Z activate-with-label: false
2025-09-07T07:50:44.5848016Z label: with-ssh
2025-09-07T07:50:44.5848200Z remove-existing-keys: true
2025-09-07T07:50:44.5848410Z fail-silently: true
2025-09-07T07:50:44.5848813Z env:
2025-09-07T07:50:44.5848999Z GIT_DEFAULT_BRANCH: main
2025-09-07T07:50:44.5849206Z ##[endgroup]
2025-09-07T07:50:44.6907548Z Please see https://github.com/pytorch/pytorch/wiki/Debugging-using-with-ssh-for-Github-Actions for more info.
2025-09-07T07:50:44.6908282Z Not on pull request and ciflow reference could not be extracted, skipping adding ssh keys
2025-09-07T07:50:44.7138020Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main
2025-09-07T07:50:44.7138391Z with:
2025-09-07T07:50:44.7138558Z no-sudo: true
2025-09-07T07:50:44.7138737Z submodules: recursive
2025-09-07T07:50:44.7138937Z fetch-depth: 0
2025-09-07T07:50:44.7139108Z env:
2025-09-07T07:50:44.7139272Z GIT_DEFAULT_BRANCH: main
2025-09-07T07:50:44.7139480Z ##[endgroup]
2025-09-07T07:50:44.8525181Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-09-07T07:50:44.8525966Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-09-07T07:50:44.8545836Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-09-07T07:50:44.8546152Z env:
2025-09-07T07:50:44.8546334Z GIT_DEFAULT_BRANCH: main
2025-09-07T07:50:44.8546582Z ##[endgroup]
2025-09-07T07:50:44.9908817Z ##[group]Run actions/checkout@v4
2025-09-07T07:50:44.9909039Z with:
2025-09-07T07:50:44.9909226Z ref: 93fb23d6fae7c4e82c4239a1033e522088742634
2025-09-07T07:50:44.9909470Z fetch-depth: 0
2025-09-07T07:50:44.9909646Z submodules: recursive 2025-09-07T07:50:44.9909831Z show-progress: false 2025-09-07T07:50:44.9910026Z repository: pytorch/pytorch 2025-09-07T07:50:44.9910317Z token: *** 2025-09-07T07:50:44.9910710Z ssh-strict: true 2025-09-07T07:50:44.9910889Z ssh-user: git 2025-09-07T07:50:44.9911065Z persist-credentials: true 2025-09-07T07:50:44.9911263Z clean: true 2025-09-07T07:50:44.9911443Z sparse-checkout-cone-mode: true 2025-09-07T07:50:44.9911659Z fetch-tags: false 2025-09-07T07:50:44.9911822Z lfs: false 2025-09-07T07:50:44.9911989Z set-safe-directory: true 2025-09-07T07:50:44.9912180Z env: 2025-09-07T07:50:44.9912411Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:50:44.9912602Z ##[endgroup] 2025-09-07T07:50:45.0879346Z Syncing repository: pytorch/pytorch 2025-09-07T07:50:45.0880524Z ##[group]Getting Git version info 2025-09-07T07:50:45.0880863Z Working directory is '/home/david/_work/pytorch/pytorch' 2025-09-07T07:50:45.0881333Z [command]/usr/bin/git version 2025-09-07T07:50:45.0892299Z git version 2.50.1 2025-09-07T07:50:45.0916957Z ##[endgroup] 2025-09-07T07:50:45.0928150Z Temporarily overriding HOME='/home/david/_work/_temp/3292dfdf-7bd7-4ec5-8eba-1fecde67fb5e' before making global git config changes 2025-09-07T07:50:45.0928858Z Adding repository directory to the temporary git global config as a safe directory 2025-09-07T07:50:45.0934203Z [command]/usr/bin/git config --global --add safe.directory /home/david/_work/pytorch/pytorch 2025-09-07T07:50:45.1279891Z Deleting the contents of '/home/david/_work/pytorch/pytorch' 2025-09-07T07:50:45.1283164Z ##[group]Initializing the repository 2025-09-07T07:50:45.1286511Z [command]/usr/bin/git init /home/david/_work/pytorch/pytorch 2025-09-07T07:50:45.2218470Z hint: Using 'master' as the name for the initial branch. This default branch name 2025-09-07T07:50:45.2218960Z hint: is subject to change. 
To configure the initial branch name to use in all 2025-09-07T07:50:45.2219388Z hint: of your new repositories, which will suppress this warning, call: 2025-09-07T07:50:45.2219691Z hint: 2025-09-07T07:50:45.2219920Z hint: git config --global init.defaultBranch 2025-09-07T07:50:45.2220192Z hint: 2025-09-07T07:50:45.2220456Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2025-09-07T07:50:45.2220868Z hint: 'development'. The just-created branch can be renamed via this command: 2025-09-07T07:50:45.2221181Z hint: 2025-09-07T07:50:45.2221340Z hint: git branch -m 2025-09-07T07:50:45.2221614Z hint: 2025-09-07T07:50:45.2221878Z hint: Disable this message with "git config set advice.defaultBranchName false" 2025-09-07T07:50:45.2224525Z Initialized empty Git repository in /home/david/_work/pytorch/pytorch/.git/ 2025-09-07T07:50:45.2231140Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2025-09-07T07:50:45.3573487Z ##[endgroup] 2025-09-07T07:50:45.3574131Z ##[group]Disabling automatic garbage collection 2025-09-07T07:50:45.3576154Z [command]/usr/bin/git config --local gc.auto 0 2025-09-07T07:50:45.4728818Z ##[endgroup] 2025-09-07T07:50:45.4729765Z ##[group]Setting up auth 2025-09-07T07:50:45.4731787Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-09-07T07:50:45.4762188Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-09-07T07:50:45.5019278Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-09-07T07:50:45.5050134Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-09-07T07:50:45.5286403Z [command]/usr/bin/git config 
--local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-09-07T07:50:45.6316662Z ##[endgroup] 2025-09-07T07:50:45.6317224Z ##[group]Fetching the repository 2025-09-07T07:50:45.6323926Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-09-07T07:51:35.7335666Z From https://github.com/pytorch/pytorch 2025-09-07T07:51:35.7336315Z * [new branch] 160583 -> origin/160583 2025-09-07T07:51:35.7340530Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-09-07T07:51:35.7341169Z * [new branch] 5addvllmbuild -> origin/5addvllmbuild 2025-09-07T07:51:35.7342071Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-09-07T07:51:35.7342763Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-09-07T07:51:35.7343496Z * [new branch] ISSUE-154849 -> origin/ISSUE-154849 2025-09-07T07:51:35.7344477Z * [new branch] JackCaoG/dynamo_make_fx_non_core_aten_ops -> origin/JackCaoG/dynamo_make_fx_non_core_aten_ops 2025-09-07T07:51:35.7346217Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128 2025-09-07T07:51:35.7347988Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-09-07T07:51:35.7349631Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-09-07T07:51:35.7351156Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-09-07T07:51:35.7352633Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-09-07T07:51:35.7354240Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-09-07T07:51:35.7356154Z * [new branch] Update-Flash-Packaging -> origin/Update-Flash-Packaging 2025-09-07T07:51:35.7357728Z * [new branch] VLA_exp -> origin/VLA_exp 2025-09-07T07:51:35.7359557Z * [new branch] actually-run-mps-aot-inductor -> origin/actually-run-mps-aot-inductor 2025-09-07T07:51:35.7361127Z * [new 
branch] add-missing-args-normalization -> origin/add-missing-args-normalization 2025-09-07T07:51:35.7362760Z * [new branch] add-user-guide-structure -> origin/add-user-guide-structure 2025-09-07T07:51:35.7364429Z * [new branch] add-vllm-nightly-build -> origin/add-vllm-nightly-build 2025-09-07T07:51:35.7366380Z * [new branch] add_compile_benchmarking -> origin/add_compile_benchmarking 2025-09-07T07:51:35.7368055Z * [new branch] addmm-heuristic -> origin/addmm-heuristic 2025-09-07T07:51:35.7369693Z * [new branch] addsimde -> origin/addsimde 2025-09-07T07:51:35.7371642Z * [new branch] addvllmtest -> origin/addvllmtest 2025-09-07T07:51:35.7374024Z * [new branch] adi/acl_upgrade -> origin/adi/acl_upgrade 2025-09-07T07:51:35.7375452Z * [new branch] adi/test -> origin/adi/test 2025-09-07T07:51:35.7377153Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-09-07T07:51:35.7378730Z * [new branch] adi/test_fusions -> origin/adi/test_fusions 2025-09-07T07:51:35.7380352Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-09-07T07:51:35.7382203Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-09-07T07:51:35.7383708Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-09-07T07:51:35.7386043Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-09-07T07:51:35.7388540Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-09-07T07:51:35.7390179Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-09-07T07:51:35.7391779Z * [new branch] alt-disable -> origin/alt-disable 2025-09-07T07:51:35.7394159Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-09-07T07:51:35.7396023Z * [new branch] angelayi/aoti_inductor_fx -> origin/angelayi/aoti_inductor_fx 2025-09-07T07:51:35.7397502Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-09-07T07:51:35.7399304Z * [new branch] angelayi/benchmark2 -> 
origin/angelayi/benchmark2 2025-09-07T07:51:35.7400922Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-09-07T07:51:35.7402510Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-09-07T07:51:35.7404164Z * [new branch] angelayi/custom_op_subgraph -> origin/angelayi/custom_op_subgraph 2025-09-07T07:51:35.7405963Z * [new branch] angelayi/customop -> origin/angelayi/customop 2025-09-07T07:51:35.7407540Z * [new branch] angelayi/fake_cache_empty -> origin/angelayi/fake_cache_empty 2025-09-07T07:51:35.7409147Z * [new branch] angelayi/is_symbolic_tracing -> origin/angelayi/is_symbolic_tracing 2025-09-07T07:51:35.7410644Z * [new branch] angelayi/item -> origin/angelayi/item 2025-09-07T07:51:35.7412160Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-09-07T07:51:35.7413764Z * [new branch] angelayi/opoverload -> origin/angelayi/opoverload 2025-09-07T07:51:35.7415551Z * [new branch] angelayi/pattern -> origin/angelayi/pattern 2025-09-07T07:51:35.7417313Z * [new branch] angelayi/pytree -> origin/angelayi/pytree 2025-09-07T07:51:35.7419153Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-09-07T07:51:35.7420718Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-09-07T07:51:35.7422465Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-09-07T07:51:35.7424138Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-09-07T07:51:35.7426020Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-09-07T07:51:35.7427622Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-09-07T07:51:35.7429272Z * [new branch] aoti_weight_sharing -> origin/aoti_weight_sharing 2025-09-07T07:51:35.7431077Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-09-07T07:51:35.7432880Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 
2025-09-07T07:51:35.7434273Z * [new branch] atalman-patch-1 -> origin/atalman-patch-1 2025-09-07T07:51:35.7436287Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-09-07T07:51:35.7437899Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-09-07T07:51:35.7439583Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-09-07T07:51:35.7441172Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-09-07T07:51:35.7442878Z * [new branch] atalman_inductor_2.3.0 -> origin/atalman_inductor_2.3.0 2025-09-07T07:51:35.7444667Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-09-07T07:51:35.7446619Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-09-07T07:51:35.7448327Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-09-07T07:51:35.7450069Z * [new branch] autoupdate-transformers-pin-via-pr -> origin/autoupdate-transformers-pin-via-pr 2025-09-07T07:51:35.7452240Z * [new branch] bahuang/dtensor_demo -> origin/bahuang/dtensor_demo 2025-09-07T07:51:35.7453864Z * [new branch] bahuang/test -> origin/bahuang/test 2025-09-07T07:51:35.7456604Z * [new branch] base/1.5 -> origin/base/1.5 2025-09-07T07:51:35.7458303Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-09-07T07:51:35.7459861Z * [new branch] bc-lint-config -> origin/bc-lint-config 2025-09-07T07:51:35.7461577Z * [new branch] bc-lint-test-new-config -> origin/bc-lint-test-new-config 2025-09-07T07:51:35.7463436Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-09-07T07:51:35.7465269Z * [new branch] benchmarker_compat_with_do_bench -> origin/benchmarker_compat_with_do_bench 2025-09-07T07:51:35.7467000Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-09-07T07:51:35.7469383Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-09-07T07:51:35.7471671Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 
2025-09-07T07:51:35.7473839Z * [new branch] bf/cg-custom-wrapper -> origin/bf/cg-custom-wrapper 2025-09-07T07:51:35.7475667Z * [new branch] bf/cg-or-error -> origin/bf/cg-or-error 2025-09-07T07:51:35.7477165Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-09-07T07:51:35.7478719Z * [new branch] bf/cg-skip-1-kernel -> origin/bf/cg-skip-1-kernel 2025-09-07T07:51:35.7480280Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-09-07T07:51:35.7481904Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-09-07T07:51:35.7483661Z * [new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-09-07T07:51:35.7485086Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-09-07T07:51:35.7486996Z * [new branch] bf/default-recompile-reason -> origin/bf/default-recompile-reason 2025-09-07T07:51:35.7488574Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-09-07T07:51:35.7490077Z * [new branch] bf/exp -> origin/bf/exp 2025-09-07T07:51:35.7491609Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-09-07T07:51:35.7493458Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-09-07T07:51:35.7495349Z * [new branch] bf/partition-turn-on -> origin/bf/partition-turn-on 2025-09-07T07:51:35.7496650Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-09-07T07:51:35.7498076Z * [new branch] bf/rope -> origin/bf/rope 2025-09-07T07:51:35.7499773Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-09-07T07:51:35.7501393Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-09-07T07:51:35.7503086Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-09-07T07:51:35.7504693Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> 
origin/bisect_perf_hf_T5_40d0740e73d 2025-09-07T07:51:35.7506804Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-09-07T07:51:35.7508446Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-09-07T07:51:35.7510198Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-09-07T07:51:35.7511841Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-09-07T07:51:35.7513526Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-09-07T07:51:35.7515266Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-09-07T07:51:35.7517000Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-09-07T07:51:35.7518612Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-09-07T07:51:35.7520157Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-09-07T07:51:35.7521838Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-09-07T07:51:35.7523423Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-09-07T07:51:35.7525276Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-09-07T07:51:35.7527742Z * [new branch] bowbao/bench_updates_stage -> origin/bowbao/bench_updates_stage 2025-09-07T07:51:35.7529283Z * [new branch] bowbao/dort_rewriter -> origin/bowbao/dort_rewriter 2025-09-07T07:51:35.7530675Z * [new branch] bowbao/wip_prs -> origin/bowbao/wip_prs 2025-09-07T07:51:35.7533071Z * [new branch] brister/break_tensorbox -> origin/brister/break_tensorbox 2025-09-07T07:51:35.7534691Z * [new branch] brister/custom_fx_backend -> origin/brister/custom_fx_backend 2025-09-07T07:51:35.7536592Z * [new branch] brister/fx_custom_triton -> origin/brister/fx_custom_triton 
2025-09-07T07:51:35.7538071Z * [new branch] brister/tensor_box_output -> origin/brister/tensor_box_output 2025-09-07T07:51:35.7539716Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-09-07T07:51:35.7541248Z * [new branch] c57382a49 -> origin/c57382a49 2025-09-07T07:51:35.7543086Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-09-07T07:51:35.7544805Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-09-07T07:51:35.7548107Z * [new branch] camyll/revert-94bc900da97ad7f3c35b3b819bb53b23c74b581a-for-release-2.8 -> origin/camyll/revert-94bc900da97ad7f3c35b3b819bb53b23c74b581a-for-release-2.8 2025-09-07T07:51:35.7550095Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-09-07T07:51:35.7552257Z * [new branch] cherry-pick-149654-by-pytorch_bot_bot_ -> origin/cherry-pick-149654-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7553902Z * [new branch] cherry-pick-151939-by-pytorch_bot_bot_ -> origin/cherry-pick-151939-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7555907Z * [new branch] cherry-pick-154174-by-pytorch_bot_bot_ -> origin/cherry-pick-154174-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7557681Z * [new branch] cherry-pick-156260-by-pytorch_bot_bot_ -> origin/cherry-pick-156260-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7559398Z * [new branch] cherry-pick-157453-by-pytorch_bot_bot_ -> origin/cherry-pick-157453-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7561242Z * [new branch] cherry-pick-157513-by-pytorch_bot_bot_ -> origin/cherry-pick-157513-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7562930Z * [new branch] cherry-pick-157695-by-pytorch_bot_bot_ -> origin/cherry-pick-157695-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7564671Z * [new branch] cherry-pick-157732-by-pytorch_bot_bot_ -> origin/cherry-pick-157732-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7566697Z * [new branch] cherry-pick-158537-by-pytorch_bot_bot_ -> origin/cherry-pick-158537-by-pytorch_bot_bot_ 
2025-09-07T07:51:35.7568401Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7570144Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-09-07T07:51:35.7572484Z * [new branch] chilli/flex_vllm -> origin/chilli/flex_vllm 2025-09-07T07:51:35.7574377Z * [new branch] cleanup-inductor-benchmark-images -> origin/cleanup-inductor-benchmark-images 2025-09-07T07:51:35.7576470Z * [new branch] codex-testing -> origin/codex-testing 2025-09-07T07:51:35.7579008Z * [new branch] codex/add-helper-function-to-sizevars.py -> origin/codex/add-helper-function-to-sizevars.py 2025-09-07T07:51:35.7580599Z * [new branch] codex/add-helper-function-to-sizevars.py_2025-09-05 -> origin/codex/add-helper-function-to-sizevars.py_2025-09-05 2025-09-07T07:51:35.7582214Z * [new branch] codex/add-metadata-field-for-file-path -> origin/codex/add-metadata-field-for-file-path 2025-09-07T07:51:35.7583919Z * [new branch] codex/add-test-for-inductor-local-cache-behavior -> origin/codex/add-test-for-inductor-local-cache-behavior 2025-09-07T07:51:35.7585848Z * [new branch] codex/create-test-for-tensor-memory-leak-in-cudagraph -> origin/codex/create-test-for-tensor-memory-leak-in-cudagraph 2025-09-07T07:51:35.7587337Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-09-07T07:51:35.7588803Z * [new branch] codex/fix-issue-160415-in-pytorch -> origin/codex/fix-issue-160415-in-pytorch 2025-09-07T07:51:35.7590464Z * [new branch] codex/fix-noqengine-quantized-engine-support -> origin/codex/fix-noqengine-quantized-engine-support 2025-09-07T07:51:35.7592230Z * [new branch] codex/fix-pin_memory-error-handling -> origin/codex/fix-pin_memory-error-handling 2025-09-07T07:51:35.7593331Z * [new branch] codex/propose-fix-for-issue-160332 -> origin/codex/propose-fix-for-issue-160332 2025-09-07T07:51:35.7595203Z * [new branch] 
codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-09-07T07:51:35.7597173Z * [new branch] codex/remove-allow-untyped-defs-and-fix-type-errors -> origin/codex/remove-allow-untyped-defs-and-fix-type-errors 2025-09-07T07:51:35.7598773Z * [new branch] compile_fsdp2_disable_stream_and_event -> origin/compile_fsdp2_disable_stream_and_event 2025-09-07T07:51:35.7600349Z * [new branch] context_test -> origin/context_test 2025-09-07T07:51:35.7602891Z * [new branch] copilot/fix-157446 -> origin/copilot/fix-157446 2025-09-07T07:51:35.7604415Z * [new branch] copy_graph -> origin/copy_graph 2025-09-07T07:51:35.7607059Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-09-07T07:51:35.7609312Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-09-07T07:51:35.7610868Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-09-07T07:51:35.7612470Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-09-07T07:51:35.7645324Z * [new branch] csl/disable_flaky_cpp_test -> origin/csl/disable_flaky_cpp_test 2025-09-07T07:51:35.7645869Z * [new branch] csl/disable_periodic_test -> origin/csl/disable_periodic_test 2025-09-07T07:51:35.7646385Z * [new branch] csl/exclude_rocm_viable_strict -> origin/csl/exclude_rocm_viable_strict 2025-09-07T07:51:35.7646865Z * [new branch] csl/katex -> origin/csl/katex 2025-09-07T07:51:35.7647311Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-09-07T07:51:35.7647771Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-09-07T07:51:35.7648185Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-09-07T07:51:35.7648930Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-09-07T07:51:35.7649419Z * [new branch] csl/name_link_check_job -> origin/csl/name_link_check_job 2025-09-07T07:51:35.7649873Z * [new branch] csl/no_keep_goin_rocm 
-> origin/csl/no_keep_goin_rocm 2025-09-07T07:51:35.7650320Z * [new branch] csl/not_600_timeout -> origin/csl/not_600_timeout 2025-09-07T07:51:35.7650732Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-09-07T07:51:35.7651167Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-09-07T07:51:35.7651663Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-09-07T07:51:35.7652154Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-09-07T07:51:35.7652597Z * [new branch] cublasltrelax2 -> origin/cublasltrelax2 2025-09-07T07:51:35.7653015Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-09-07T07:51:35.7653464Z * [new branch] cudnnsdparefactor -> origin/cudnnsdparefactor 2025-09-07T07:51:35.7653940Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-09-07T07:51:35.7654393Z * [new branch] czhuge_muon_dev -> origin/czhuge_muon_dev 2025-09-07T07:51:35.7654829Z * [new branch] d4l3k/delete_hook -> origin/d4l3k/delete_hook 2025-09-07T07:51:35.7655407Z * [new branch] dcp_zoc -> origin/dcp_zoc 2025-09-07T07:51:35.7655815Z * [new branch] debug-guard -> origin/debug-guard 2025-09-07T07:51:35.7656231Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-09-07T07:51:35.7659682Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.2 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.2 2025-09-07T07:51:35.7661282Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.3 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.3 2025-09-07T07:51:35.7663278Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.4 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.55.4 2025-09-07T07:51:35.7665578Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.56.0 -> 
origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.56.0 2025-09-07T07:51:35.7666824Z * [new branch] dependabot/pip/dot-ci/docker/protobuf-5.29.5 -> origin/dependabot/pip/dot-ci/docker/protobuf-5.29.5 2025-09-07T07:51:35.7669895Z * [new branch] dependabot/pip/dot-github/requirements/protobuf-5.29.5 -> origin/dependabot/pip/dot-github/requirements/protobuf-5.29.5 2025-09-07T07:51:35.7672028Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-09-07T07:51:35.7673663Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-09-07T07:51:35.7677251Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-09-07T07:51:35.7678913Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-09-07T07:51:35.7680802Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-09-07T07:51:35.7682557Z * [new branch] dev/joona/cat_remove_graph -> origin/dev/joona/cat_remove_graph 2025-09-07T07:51:35.7684063Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-09-07T07:51:35.7686159Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-09-07T07:51:35.7687938Z * [new branch] dev/joona/maxpool2dwithindices_errmsg -> origin/dev/joona/maxpool2dwithindices_errmsg 2025-09-07T07:51:35.7689651Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-09-07T07:51:35.7691395Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-09-07T07:51:35.7693522Z * [new branch] dev/joona/topk_newapi -> origin/dev/joona/topk_newapi 2025-09-07T07:51:35.7695360Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-09-07T07:51:35.7697125Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-09-07T07:51:35.7698995Z * [new branch] disable -> origin/disable 2025-09-07T07:51:35.7700819Z * [new branch] e2e-baseline -> origin/e2e-baseline 2025-09-07T07:51:35.7702796Z * 
[new branch] eigen_for_sparse_addmm_v2 -> origin/eigen_for_sparse_addmm_v2 2025-09-07T07:51:35.7705440Z * [new branch] embg/test_inductor_ci_128B -> origin/embg/test_inductor_ci_128B 2025-09-07T07:51:35.7707511Z * [new branch] embg/test_inductor_ci_base -> origin/embg/test_inductor_ci_base 2025-09-07T07:51:35.7709063Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-09-07T07:51:35.7710783Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-09-07T07:51:35.7712039Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-09-07T07:51:35.7713897Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-09-07T07:51:35.7716012Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-09-07T07:51:35.7717734Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-09-07T07:51:35.7719556Z * [new branch] eqy-patch-4 -> origin/eqy-patch-4 2025-09-07T07:51:35.7721641Z * [new branch] example-convert-torch.nn -> origin/example-convert-torch.nn 2025-09-07T07:51:35.7724165Z * [new branch] exclamaforte/add-contiguous-threshold -> origin/exclamaforte/add-contiguous-threshold 2025-09-07T07:51:35.7725872Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-09-07T07:51:35.7727455Z * [new branch] exclamaforte/bump-transformer-version -> origin/exclamaforte/bump-transformer-version 2025-09-07T07:51:35.7729233Z * [new branch] exclamaforte/clear-feedback-savers -> origin/exclamaforte/clear-feedback-savers 2025-09-07T07:51:35.7730611Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-09-07T07:51:35.7732081Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-09-07T07:51:35.7733699Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-09-07T07:51:35.7735424Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> 
origin/exclamaforte/fix-exhaustive-autotuning 2025-09-07T07:51:35.7737166Z * [new branch] exclamaforte/fix-exhuastive-autotuning-reland -> origin/exclamaforte/fix-exhuastive-autotuning-reland 2025-09-07T07:51:35.7738679Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-09-07T07:51:35.7740197Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-09-07T07:51:35.7741731Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-09-07T07:51:35.7743521Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-09-07T07:51:35.7745218Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-09-07T07:51:35.7746843Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-09-07T07:51:35.7748499Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-09-07T07:51:35.7749888Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-09-07T07:51:35.7751518Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-09-07T07:51:35.7753079Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-09-07T07:51:35.7754555Z * [new branch] exclamaforte/max-autotune-ieee -> origin/exclamaforte/max-autotune-ieee 2025-09-07T07:51:35.7756416Z * [new branch] exclamaforte/memory-counter -> origin/exclamaforte/memory-counter 2025-09-07T07:51:35.7757953Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-09-07T07:51:35.7759518Z * [new branch] exclamaforte/profiler-combo -> origin/exclamaforte/profiler-combo 2025-09-07T07:51:35.7761092Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-09-07T07:51:35.7762771Z 
* [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-09-07T07:51:35.7764376Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-09-07T07:51:35.7767122Z * [new branch] exclamforte/gemm-model-final -> origin/exclamforte/gemm-model-final 2025-09-07T07:51:35.7768819Z * [new branch] exec -> origin/exec 2025-09-07T07:51:35.7770725Z * [new branch] executorch-module-shim -> origin/executorch-module-shim 2025-09-07T07:51:35.7772629Z * [new branch] experimental-mosaic -> origin/experimental-mosaic 2025-09-07T07:51:35.7774497Z * [new branch] export-D58091437 -> origin/export-D58091437 2025-09-07T07:51:35.7776724Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-09-07T07:51:35.7778512Z * [new branch] export-D70112642 -> origin/export-D70112642 2025-09-07T07:51:35.7780474Z * [new branch] export-D71412006 -> origin/export-D71412006 2025-09-07T07:51:35.7783056Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-09-07T07:51:35.7784770Z * [new branch] export-D75183591 -> origin/export-D75183591 2025-09-07T07:51:35.7787125Z * [new branch] export-D75617432 -> origin/export-D75617432 2025-09-07T07:51:35.7788901Z * [new branch] export-D75659965 -> origin/export-D75659965 2025-09-07T07:51:35.7790779Z * [new branch] export-D76080931 -> origin/export-D76080931 2025-09-07T07:51:35.7792562Z * [new branch] export-D76797250 -> origin/export-D76797250 2025-09-07T07:51:35.7794388Z * [new branch] export-D76885271 -> origin/export-D76885271 2025-09-07T07:51:35.7796529Z * [new branch] export-D76885620 -> origin/export-D76885620 2025-09-07T07:51:35.7798381Z * [new branch] export-D76936623 -> origin/export-D76936623 2025-09-07T07:51:35.7800289Z * [new branch] export-D76958268 -> origin/export-D76958268 2025-09-07T07:51:35.7802103Z * [new branch] export-D78375400 -> origin/export-D78375400 2025-09-07T07:51:35.7804071Z * [new branch] export-D78431305 -> 
origin/export-D78431305 2025-09-07T07:51:35.7806264Z * [new branch] export-D78580107 -> origin/export-D78580107 2025-09-07T07:51:35.7808164Z * [new branch] export-D78822171 -> origin/export-D78822171 2025-09-07T07:51:35.7810471Z * [new branch] export-D78822351 -> origin/export-D78822351 2025-09-07T07:51:35.7812362Z * [new branch] export-D78822507 -> origin/export-D78822507 2025-09-07T07:51:35.7814182Z * [new branch] export-D78826994 -> origin/export-D78826994 2025-09-07T07:51:35.7816786Z * [new branch] export-D78894324 -> origin/export-D78894324 2025-09-07T07:51:35.7818552Z * [new branch] export-D78929245 -> origin/export-D78929245 2025-09-07T07:51:35.7820251Z * [new branch] export-D78934925 -> origin/export-D78934925 2025-09-07T07:51:35.7822113Z * [new branch] export-D78953203 -> origin/export-D78953203 2025-09-07T07:51:35.7823857Z * [new branch] export-D78953229 -> origin/export-D78953229 2025-09-07T07:51:35.7825707Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-09-07T07:51:35.7827535Z * [new branch] export-D78957389 -> origin/export-D78957389 2025-09-07T07:51:35.7829164Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-09-07T07:51:35.7830826Z * [new branch] export-D79026433 -> origin/export-D79026433 2025-09-07T07:51:35.7832627Z * [new branch] export-D79230339 -> origin/export-D79230339 2025-09-07T07:51:35.7834324Z * [new branch] export-D79319835 -> origin/export-D79319835 2025-09-07T07:51:35.7836347Z * [new branch] export-D79328456 -> origin/export-D79328456 2025-09-07T07:51:35.7838152Z * [new branch] export-D79534608 -> origin/export-D79534608 2025-09-07T07:51:35.7840089Z * [new branch] export-D79785974 -> origin/export-D79785974 2025-09-07T07:51:35.7841949Z * [new branch] export-D80025417 -> origin/export-D80025417 2025-09-07T07:51:35.7843688Z * [new branch] export-D80120333 -> origin/export-D80120333 2025-09-07T07:51:35.7845719Z * [new branch] export-D80214882 -> origin/export-D80214882 2025-09-07T07:51:35.7847542Z * [new 
branch] export-D80319069 -> origin/export-D80319069 2025-09-07T07:51:35.7849375Z * [new branch] export-D80321215 -> origin/export-D80321215 2025-09-07T07:51:35.7851308Z * [new branch] export-D80503451 -> origin/export-D80503451 2025-09-07T07:51:35.7852749Z * [new branch] export-D80771648 -> origin/export-D80771648 2025-09-07T07:51:35.7854498Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-09-07T07:51:35.7856705Z * [new branch] export-D80948073 -> origin/export-D80948073 2025-09-07T07:51:35.7858513Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-09-07T07:51:35.7860285Z * [new branch] export-D80970483 -> origin/export-D80970483 2025-09-07T07:51:35.7862256Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-09-07T07:51:35.7864070Z * [new branch] export-D81060182 -> origin/export-D81060182 2025-09-07T07:51:35.7866185Z * [new branch] export-D81078973 -> origin/export-D81078973 2025-09-07T07:51:35.7867977Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-09-07T07:51:35.7869771Z * [new branch] export-D81284190 -> origin/export-D81284190 2025-09-07T07:51:35.7871517Z * [new branch] export-D81299840 -> origin/export-D81299840 2025-09-07T07:51:35.7873239Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-09-07T07:51:35.7875200Z * [new branch] export-D81698719 -> origin/export-D81698719 2025-09-07T07:51:35.7877305Z * [new branch] export-D81747409 -> origin/export-D81747409 2025-09-07T07:51:35.7879237Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-09-07T07:51:35.7881522Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-09-07T07:51:35.7883302Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-09-07T07:51:35.7885165Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-09-07T07:51:35.7887721Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-09-07T07:51:35.7889728Z * [new branch] fca -> origin/fca 
2025-09-07T07:51:35.7891490Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-09-07T07:51:35.7893233Z * [new branch] fca5 -> origin/fca5 2025-09-07T07:51:35.7895992Z * [new branch] feature/function-numa-binding -> origin/feature/function-numa-binding 2025-09-07T07:51:35.7897598Z * [new branch] feature/function-numa-binding-take2 -> origin/feature/function-numa-binding-take2 2025-09-07T07:51:35.7899160Z * [new branch] feature/numa-nproc-fix -> origin/feature/numa-nproc-fix 2025-09-07T07:51:35.7900577Z * [new branch] feature/numa-signpost-serialize -> origin/feature/numa-signpost-serialize 2025-09-07T07:51:35.7902241Z * [new branch] feature/parallel-numa-binding -> origin/feature/parallel-numa-binding 2025-09-07T07:51:35.7904628Z * [new branch] fengyuan/external-proj -> origin/fengyuan/external-proj 2025-09-07T07:51:35.7906636Z * [new branch] fengyuan/out-of-tree-xpu-ops-improve-test -> origin/fengyuan/out-of-tree-xpu-ops-improve-test 2025-09-07T07:51:35.7908152Z * [new branch] fengyuan/out-of-tree-xpu-ops-remove-dtype -> origin/fengyuan/out-of-tree-xpu-ops-remove-dtype 2025-09-07T07:51:35.7909581Z * [new branch] fengyuan/test-xpu -> origin/fengyuan/test-xpu 2025-09-07T07:51:35.7911931Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-09-07T07:51:35.7913472Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-09-07T07:51:35.7916380Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-09-07T07:51:35.7918126Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-09-07T07:51:35.7919416Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-09-07T07:51:35.7920898Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-09-07T07:51:35.7922366Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-09-07T07:51:35.7923998Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-09-07T07:51:35.7925869Z * [new branch] 
findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-09-07T07:51:35.7927507Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-09-07T07:51:35.7929287Z * [new branch] fix -> origin/fix 2025-09-07T07:51:35.7931090Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-09-07T07:51:35.7932871Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-09-07T07:51:35.7934634Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-09-07T07:51:35.7936891Z * [new branch] fix-inductor-periodic-0528 -> origin/fix-inductor-periodic-0528 2025-09-07T07:51:35.7938587Z * [new branch] fix-mps-benchmark -> origin/fix-mps-benchmark 2025-09-07T07:51:35.7940357Z * [new branch] fix-rlease-feature-template -> origin/fix-rlease-feature-template 2025-09-07T07:51:35.7942368Z * [new branch] fix-run-condition-upload-results -> origin/fix-run-condition-upload-results 2025-09-07T07:51:35.7944041Z * [new branch] fix-torchbench -> origin/fix-torchbench 2025-09-07T07:51:35.7946127Z * [new branch] fix_153389 -> origin/fix_153389 2025-09-07T07:51:35.7948053Z * [new branch] fix_fsdp_rs_bucket2 -> origin/fix_fsdp_rs_bucket2 2025-09-07T07:51:35.7949866Z * [new branch] fix_inductor_peridic_tests -> origin/fix_inductor_peridic_tests 2025-09-07T07:51:35.7951524Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-09-07T07:51:35.7953383Z * [new branch] fixes-triage -> origin/fixes-triage 2025-09-07T07:51:35.7955393Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-09-07T07:51:35.7957378Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-09-07T07:51:35.7959143Z * [new branch] flex-flash -> origin/flex-flash 2025-09-07T07:51:35.7960883Z * [new branch] flex-lowering -> origin/flex-lowering 2025-09-07T07:51:35.7962640Z * [new branch] flex-warning -> origin/flex-warning 2025-09-07T07:51:35.7964533Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 
2025-09-07T07:51:35.7966808Z * [new branch] flex_flash -> origin/flex_flash 2025-09-07T07:51:35.7968661Z * [new branch] flexdecode-gqa-groups -> origin/flexdecode-gqa-groups 2025-09-07T07:51:35.7971313Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-09-07T07:51:35.7973161Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-09-07T07:51:35.7974914Z * [new branch] fsdpv2_3d -> origin/fsdpv2_3d 2025-09-07T07:51:35.7977152Z * [new branch] fsdpv2_3d_m1 -> origin/fsdpv2_3d_m1 2025-09-07T07:51:35.7978986Z * [new branch] fx_cpp -> origin/fx_cpp 2025-09-07T07:51:35.7981514Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-09-07T07:51:35.7985499Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-09-07T07:51:35.7987343Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-09-07T07:51:35.7989907Z * [new branch] gh/CaoE/2/base -> origin/gh/CaoE/2/base 2025-09-07T07:51:35.7991419Z * [new branch] gh/CaoE/2/head -> origin/gh/CaoE/2/head 2025-09-07T07:51:35.7993027Z * [new branch] gh/CaoE/2/orig -> origin/gh/CaoE/2/orig 2025-09-07T07:51:35.7996527Z * [new branch] gh/ColinPeppler/79/base -> origin/gh/ColinPeppler/79/base 2025-09-07T07:51:35.7998237Z * [new branch] gh/ColinPeppler/79/head -> origin/gh/ColinPeppler/79/head 2025-09-07T07:51:35.7999907Z * [new branch] gh/ColinPeppler/79/orig -> origin/gh/ColinPeppler/79/orig 2025-09-07T07:51:35.8002328Z * [new branch] gh/ColinPeppler/80/base -> origin/gh/ColinPeppler/80/base 2025-09-07T07:51:35.8004043Z * [new branch] gh/ColinPeppler/80/head -> origin/gh/ColinPeppler/80/head 2025-09-07T07:51:35.8005945Z * [new branch] gh/ColinPeppler/80/orig -> origin/gh/ColinPeppler/80/orig 2025-09-07T07:51:35.8008832Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-09-07T07:51:35.8010372Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-09-07T07:51:35.8012690Z * [new branch] gh/EikanWang/80/base -> 
origin/gh/EikanWang/80/base 2025-09-07T07:51:35.8014338Z * [new branch] gh/EikanWang/80/head -> origin/gh/EikanWang/80/head 2025-09-07T07:51:35.8016289Z * [new branch] gh/EikanWang/80/orig -> origin/gh/EikanWang/80/orig 2025-09-07T07:51:35.8018587Z * [new branch] gh/EikanWang/81/base -> origin/gh/EikanWang/81/base 2025-09-07T07:51:35.8020218Z * [new branch] gh/EikanWang/81/head -> origin/gh/EikanWang/81/head 2025-09-07T07:51:35.8021908Z * [new branch] gh/EikanWang/81/orig -> origin/gh/EikanWang/81/orig 2025-09-07T07:51:35.8024135Z * [new branch] gh/EikanWang/82/base -> origin/gh/EikanWang/82/base 2025-09-07T07:51:35.8026112Z * [new branch] gh/EikanWang/82/head -> origin/gh/EikanWang/82/head 2025-09-07T07:51:35.8027722Z * [new branch] gh/EikanWang/82/orig -> origin/gh/EikanWang/82/orig 2025-09-07T07:51:35.8030807Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-09-07T07:51:35.8032454Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-09-07T07:51:35.8035658Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-09-07T07:51:35.8037322Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-09-07T07:51:35.8038930Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-09-07T07:51:35.8041144Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-09-07T07:51:35.8042763Z * [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-09-07T07:51:35.8044293Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-09-07T07:51:35.8047025Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-09-07T07:51:35.8048476Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-09-07T07:51:35.8050026Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-09-07T07:51:35.8052342Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-09-07T07:51:35.8053880Z * [new branch] gh/H-Huang/182/head -> 
origin/gh/H-Huang/182/head 2025-09-07T07:51:35.8055822Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-09-07T07:51:35.8058409Z * [new branch] gh/H-Huang/187/base -> origin/gh/H-Huang/187/base 2025-09-07T07:51:35.8059844Z * [new branch] gh/H-Huang/187/head -> origin/gh/H-Huang/187/head 2025-09-07T07:51:35.8061268Z * [new branch] gh/H-Huang/187/orig -> origin/gh/H-Huang/187/orig 2025-09-07T07:51:35.8063702Z * [new branch] gh/H-Huang/202/base -> origin/gh/H-Huang/202/base 2025-09-07T07:51:35.8065533Z * [new branch] gh/H-Huang/202/head -> origin/gh/H-Huang/202/head 2025-09-07T07:51:35.8067104Z * [new branch] gh/H-Huang/202/orig -> origin/gh/H-Huang/202/orig 2025-09-07T07:51:35.8069447Z * [new branch] gh/H-Huang/203/base -> origin/gh/H-Huang/203/base 2025-09-07T07:51:35.8071065Z * [new branch] gh/H-Huang/203/head -> origin/gh/H-Huang/203/head 2025-09-07T07:51:35.8072835Z * [new branch] gh/H-Huang/203/orig -> origin/gh/H-Huang/203/orig 2025-09-07T07:51:35.8075407Z * [new branch] gh/H-Huang/204/base -> origin/gh/H-Huang/204/base 2025-09-07T07:51:35.8077130Z * [new branch] gh/H-Huang/204/head -> origin/gh/H-Huang/204/head 2025-09-07T07:51:35.8078657Z * [new branch] gh/H-Huang/204/orig -> origin/gh/H-Huang/204/orig 2025-09-07T07:51:35.8080899Z * [new branch] gh/H-Huang/205/base -> origin/gh/H-Huang/205/base 2025-09-07T07:51:35.8082532Z * [new branch] gh/H-Huang/205/head -> origin/gh/H-Huang/205/head 2025-09-07T07:51:35.8084131Z * [new branch] gh/H-Huang/205/orig -> origin/gh/H-Huang/205/orig 2025-09-07T07:51:35.8086708Z * [new branch] gh/H-Huang/206/base -> origin/gh/H-Huang/206/base 2025-09-07T07:51:35.8088224Z * [new branch] gh/H-Huang/206/head -> origin/gh/H-Huang/206/head 2025-09-07T07:51:35.8089752Z * [new branch] gh/H-Huang/206/orig -> origin/gh/H-Huang/206/orig 2025-09-07T07:51:35.8092137Z * [new branch] gh/H-Huang/207/base -> origin/gh/H-Huang/207/base 2025-09-07T07:51:35.8093683Z * [new branch] gh/H-Huang/207/head -> 
origin/gh/H-Huang/207/head 2025-09-07T07:51:35.8095587Z * [new branch] gh/H-Huang/207/orig -> origin/gh/H-Huang/207/orig 2025-09-07T07:51:35.8097875Z * [new branch] gh/H-Huang/208/base -> origin/gh/H-Huang/208/base 2025-09-07T07:51:35.8099420Z * [new branch] gh/H-Huang/208/head -> origin/gh/H-Huang/208/head 2025-09-07T07:51:35.8101061Z * [new branch] gh/H-Huang/208/orig -> origin/gh/H-Huang/208/orig 2025-09-07T07:51:35.8103475Z * [new branch] gh/H-Huang/209/base -> origin/gh/H-Huang/209/base 2025-09-07T07:51:35.8105181Z * [new branch] gh/H-Huang/209/head -> origin/gh/H-Huang/209/head 2025-09-07T07:51:35.8106962Z * [new branch] gh/H-Huang/209/orig -> origin/gh/H-Huang/209/orig 2025-09-07T07:51:35.8109257Z * [new branch] gh/H-Huang/210/base -> origin/gh/H-Huang/210/base 2025-09-07T07:51:35.8110827Z * [new branch] gh/H-Huang/210/head -> origin/gh/H-Huang/210/head 2025-09-07T07:51:35.8112457Z * [new branch] gh/H-Huang/210/orig -> origin/gh/H-Huang/210/orig 2025-09-07T07:51:35.8114735Z * [new branch] gh/H-Huang/211/base -> origin/gh/H-Huang/211/base 2025-09-07T07:51:35.8116654Z * [new branch] gh/H-Huang/211/head -> origin/gh/H-Huang/211/head 2025-09-07T07:51:35.8118109Z * [new branch] gh/H-Huang/211/orig -> origin/gh/H-Huang/211/orig 2025-09-07T07:51:35.8120365Z * [new branch] gh/H-Huang/212/base -> origin/gh/H-Huang/212/base 2025-09-07T07:51:35.8121925Z * [new branch] gh/H-Huang/212/head -> origin/gh/H-Huang/212/head 2025-09-07T07:51:35.8123728Z * [new branch] gh/H-Huang/212/orig -> origin/gh/H-Huang/212/orig 2025-09-07T07:51:35.8126378Z * [new branch] gh/H-Huang/213/base -> origin/gh/H-Huang/213/base 2025-09-07T07:51:35.8127921Z * [new branch] gh/H-Huang/213/head -> origin/gh/H-Huang/213/head 2025-09-07T07:51:35.8129460Z * [new branch] gh/H-Huang/213/orig -> origin/gh/H-Huang/213/orig 2025-09-07T07:51:35.8131745Z * [new branch] gh/H-Huang/214/base -> origin/gh/H-Huang/214/base 2025-09-07T07:51:35.8133292Z * [new branch] gh/H-Huang/214/head -> 
origin/gh/H-Huang/214/head 2025-09-07T07:51:35.8134867Z * [new branch] gh/H-Huang/214/orig -> origin/gh/H-Huang/214/orig 2025-09-07T07:51:35.8138147Z * [new branch] gh/IvanKobzarev/112/base -> origin/gh/IvanKobzarev/112/base 2025-09-07T07:51:35.8139700Z * [new branch] gh/IvanKobzarev/112/head -> origin/gh/IvanKobzarev/112/head 2025-09-07T07:51:35.8141315Z * [new branch] gh/IvanKobzarev/112/orig -> origin/gh/IvanKobzarev/112/orig 2025-09-07T07:51:35.8143853Z * [new branch] gh/IvanKobzarev/115/base -> origin/gh/IvanKobzarev/115/base 2025-09-07T07:51:35.8145737Z * [new branch] gh/IvanKobzarev/115/head -> origin/gh/IvanKobzarev/115/head 2025-09-07T07:51:35.8147446Z * [new branch] gh/IvanKobzarev/115/orig -> origin/gh/IvanKobzarev/115/orig 2025-09-07T07:51:35.8150067Z * [new branch] gh/IvanKobzarev/116/base -> origin/gh/IvanKobzarev/116/base 2025-09-07T07:51:35.8151761Z * [new branch] gh/IvanKobzarev/116/head -> origin/gh/IvanKobzarev/116/head 2025-09-07T07:51:35.8153292Z * [new branch] gh/IvanKobzarev/116/orig -> origin/gh/IvanKobzarev/116/orig 2025-09-07T07:51:35.8155997Z * [new branch] gh/IvanKobzarev/118/base -> origin/gh/IvanKobzarev/118/base 2025-09-07T07:51:35.8157555Z * [new branch] gh/IvanKobzarev/118/head -> origin/gh/IvanKobzarev/118/head 2025-09-07T07:51:35.8159072Z * [new branch] gh/IvanKobzarev/118/orig -> origin/gh/IvanKobzarev/118/orig 2025-09-07T07:51:35.8161584Z * [new branch] gh/IvanKobzarev/126/base -> origin/gh/IvanKobzarev/126/base 2025-09-07T07:51:35.8163248Z * [new branch] gh/IvanKobzarev/126/head -> origin/gh/IvanKobzarev/126/head 2025-09-07T07:51:35.8164808Z * [new branch] gh/IvanKobzarev/126/orig -> origin/gh/IvanKobzarev/126/orig 2025-09-07T07:51:35.8167597Z * [new branch] gh/IvanKobzarev/127/base -> origin/gh/IvanKobzarev/127/base 2025-09-07T07:51:35.8169338Z * [new branch] gh/IvanKobzarev/127/head -> origin/gh/IvanKobzarev/127/head 2025-09-07T07:51:35.8170933Z * [new branch] gh/IvanKobzarev/127/orig -> origin/gh/IvanKobzarev/127/orig 
2025-09-07T07:51:35.8173276Z * [new branch] gh/IvanKobzarev/128/base -> origin/gh/IvanKobzarev/128/base 2025-09-07T07:51:35.8175109Z * [new branch] gh/IvanKobzarev/128/head -> origin/gh/IvanKobzarev/128/head 2025-09-07T07:51:35.8176712Z * [new branch] gh/IvanKobzarev/128/orig -> origin/gh/IvanKobzarev/128/orig 2025-09-07T07:51:35.8179195Z * [new branch] gh/IvanKobzarev/132/base -> origin/gh/IvanKobzarev/132/base 2025-09-07T07:51:35.8180752Z * [new branch] gh/IvanKobzarev/132/head -> origin/gh/IvanKobzarev/132/head 2025-09-07T07:51:35.8182524Z * [new branch] gh/IvanKobzarev/132/orig -> origin/gh/IvanKobzarev/132/orig 2025-09-07T07:51:35.8185449Z * [new branch] gh/IvanKobzarev/133/base -> origin/gh/IvanKobzarev/133/base 2025-09-07T07:51:35.8187401Z * [new branch] gh/IvanKobzarev/133/head -> origin/gh/IvanKobzarev/133/head 2025-09-07T07:51:35.8189003Z * [new branch] gh/IvanKobzarev/133/orig -> origin/gh/IvanKobzarev/133/orig 2025-09-07T07:51:35.8191498Z * [new branch] gh/IvanKobzarev/134/base -> origin/gh/IvanKobzarev/134/base 2025-09-07T07:51:35.8192860Z * [new branch] gh/IvanKobzarev/134/head -> origin/gh/IvanKobzarev/134/head 2025-09-07T07:51:35.8194316Z * [new branch] gh/IvanKobzarev/134/orig -> origin/gh/IvanKobzarev/134/orig 2025-09-07T07:51:35.8197287Z * [new branch] gh/IvanKobzarev/135/base -> origin/gh/IvanKobzarev/135/base 2025-09-07T07:51:35.8198898Z * [new branch] gh/IvanKobzarev/135/head -> origin/gh/IvanKobzarev/135/head 2025-09-07T07:51:35.8200421Z * [new branch] gh/IvanKobzarev/135/orig -> origin/gh/IvanKobzarev/135/orig 2025-09-07T07:51:35.8202855Z * [new branch] gh/IvanKobzarev/136/base -> origin/gh/IvanKobzarev/136/base 2025-09-07T07:51:35.8204547Z * [new branch] gh/IvanKobzarev/136/head -> origin/gh/IvanKobzarev/136/head 2025-09-07T07:51:35.8206441Z * [new branch] gh/IvanKobzarev/136/orig -> origin/gh/IvanKobzarev/136/orig 2025-09-07T07:51:35.8208592Z * [new branch] gh/IvanKobzarev/137/base -> origin/gh/IvanKobzarev/137/base 
2025-09-07T07:51:35.8210238Z * [new branch] gh/IvanKobzarev/137/head -> origin/gh/IvanKobzarev/137/head 2025-09-07T07:51:35.8211766Z * [new branch] gh/IvanKobzarev/137/orig -> origin/gh/IvanKobzarev/137/orig 2025-09-07T07:51:35.8214174Z * [new branch] gh/IvanKobzarev/138/base -> origin/gh/IvanKobzarev/138/base 2025-09-07T07:51:35.8216307Z * [new branch] gh/IvanKobzarev/138/head -> origin/gh/IvanKobzarev/138/head 2025-09-07T07:51:35.8218000Z * [new branch] gh/IvanKobzarev/138/orig -> origin/gh/IvanKobzarev/138/orig 2025-09-07T07:51:35.8220372Z * [new branch] gh/IvanKobzarev/139/base -> origin/gh/IvanKobzarev/139/base 2025-09-07T07:51:35.8222116Z * [new branch] gh/IvanKobzarev/139/head -> origin/gh/IvanKobzarev/139/head 2025-09-07T07:51:35.8223664Z * [new branch] gh/IvanKobzarev/139/orig -> origin/gh/IvanKobzarev/139/orig 2025-09-07T07:51:35.8226499Z * [new branch] gh/IvanKobzarev/140/base -> origin/gh/IvanKobzarev/140/base 2025-09-07T07:51:35.8228090Z * [new branch] gh/IvanKobzarev/140/head -> origin/gh/IvanKobzarev/140/head 2025-09-07T07:51:35.8229718Z * [new branch] gh/IvanKobzarev/140/orig -> origin/gh/IvanKobzarev/140/orig 2025-09-07T07:51:35.8232168Z * [new branch] gh/IvanKobzarev/141/base -> origin/gh/IvanKobzarev/141/base 2025-09-07T07:51:35.8233872Z * [new branch] gh/IvanKobzarev/141/head -> origin/gh/IvanKobzarev/141/head 2025-09-07T07:51:35.8236312Z * [new branch] gh/IvanKobzarev/141/orig -> origin/gh/IvanKobzarev/141/orig 2025-09-07T07:51:35.8238550Z * [new branch] gh/IvanKobzarev/142/base -> origin/gh/IvanKobzarev/142/base 2025-09-07T07:51:35.8239889Z * [new branch] gh/IvanKobzarev/142/head -> origin/gh/IvanKobzarev/142/head 2025-09-07T07:51:35.8241428Z * [new branch] gh/IvanKobzarev/142/orig -> origin/gh/IvanKobzarev/142/orig 2025-09-07T07:51:35.8243875Z * [new branch] gh/IvanKobzarev/143/base -> origin/gh/IvanKobzarev/143/base 2025-09-07T07:51:35.8245755Z * [new branch] gh/IvanKobzarev/143/head -> origin/gh/IvanKobzarev/143/head 
2025-09-07T07:51:35.8247459Z * [new branch] gh/IvanKobzarev/143/orig -> origin/gh/IvanKobzarev/143/orig 2025-09-07T07:51:35.8249864Z * [new branch] gh/IvanKobzarev/144/base -> origin/gh/IvanKobzarev/144/base 2025-09-07T07:51:35.8251451Z * [new branch] gh/IvanKobzarev/144/head -> origin/gh/IvanKobzarev/144/head 2025-09-07T07:51:35.8253019Z * [new branch] gh/IvanKobzarev/144/orig -> origin/gh/IvanKobzarev/144/orig 2025-09-07T07:51:35.8255608Z * [new branch] gh/IvanKobzarev/145/base -> origin/gh/IvanKobzarev/145/base 2025-09-07T07:51:35.8257531Z * [new branch] gh/IvanKobzarev/145/head -> origin/gh/IvanKobzarev/145/head 2025-09-07T07:51:35.8258917Z * [new branch] gh/IvanKobzarev/145/orig -> origin/gh/IvanKobzarev/145/orig 2025-09-07T07:51:35.8261254Z * [new branch] gh/IvanKobzarev/146/base -> origin/gh/IvanKobzarev/146/base 2025-09-07T07:51:35.8263021Z * [new branch] gh/IvanKobzarev/146/head -> origin/gh/IvanKobzarev/146/head 2025-09-07T07:51:35.8264593Z * [new branch] gh/IvanKobzarev/146/orig -> origin/gh/IvanKobzarev/146/orig 2025-09-07T07:51:35.8268110Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-09-07T07:51:35.8269796Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-09-07T07:51:35.8272006Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-09-07T07:51:35.8273543Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-09-07T07:51:35.8276406Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-09-07T07:51:35.8278043Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-09-07T07:51:35.8280840Z * [new branch] gh/PaliC/1/base -> origin/gh/PaliC/1/base 2025-09-07T07:51:35.8282398Z * [new branch] gh/PaliC/1/head -> origin/gh/PaliC/1/head 2025-09-07T07:51:35.8284019Z * [new branch] gh/PaliC/1/orig -> origin/gh/PaliC/1/orig 2025-09-07T07:51:35.8286674Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 
2025-09-07T07:51:35.8288241Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-09-07T07:51:35.8289942Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-09-07T07:51:35.8292224Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-09-07T07:51:35.8293728Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-09-07T07:51:35.8295490Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-09-07T07:51:35.8297814Z * [new branch] gh/PaliC/2/base -> origin/gh/PaliC/2/base 2025-09-07T07:51:35.8299430Z * [new branch] gh/PaliC/2/head -> origin/gh/PaliC/2/head 2025-09-07T07:51:35.8301022Z * [new branch] gh/PaliC/2/orig -> origin/gh/PaliC/2/orig 2025-09-07T07:51:35.8303488Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-09-07T07:51:35.8305263Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-09-07T07:51:35.8306958Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-09-07T07:51:35.8309189Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-09-07T07:51:35.8310840Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-09-07T07:51:35.8312399Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-09-07T07:51:35.8314627Z * [new branch] gh/PaliC/22/base -> origin/gh/PaliC/22/base 2025-09-07T07:51:35.8316488Z * [new branch] gh/PaliC/22/head -> origin/gh/PaliC/22/head 2025-09-07T07:51:35.8318043Z * [new branch] gh/PaliC/22/orig -> origin/gh/PaliC/22/orig 2025-09-07T07:51:35.8320173Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-09-07T07:51:35.8321864Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-09-07T07:51:35.8323421Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-09-07T07:51:35.8326360Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-09-07T07:51:35.8327912Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-09-07T07:51:35.8329321Z * [new branch] gh/PaliC/24/orig -> 
origin/gh/PaliC/24/orig 2025-09-07T07:51:35.8332164Z * [new branch] gh/PaulZhang12/17/base -> origin/gh/PaulZhang12/17/base 2025-09-07T07:51:35.8333713Z * [new branch] gh/PaulZhang12/17/head -> origin/gh/PaulZhang12/17/head 2025-09-07T07:51:35.8336563Z * [new branch] gh/PaulZhang12/20/base -> origin/gh/PaulZhang12/20/base 2025-09-07T07:51:35.8338343Z * [new branch] gh/PaulZhang12/20/head -> origin/gh/PaulZhang12/20/head 2025-09-07T07:51:35.8339833Z * [new branch] gh/PaulZhang12/20/orig -> origin/gh/PaulZhang12/20/orig 2025-09-07T07:51:35.8342220Z * [new branch] gh/PaulZhang12/21/base -> origin/gh/PaulZhang12/21/base 2025-09-07T07:51:35.8343819Z * [new branch] gh/PaulZhang12/21/head -> origin/gh/PaulZhang12/21/head 2025-09-07T07:51:35.8345820Z * [new branch] gh/PaulZhang12/21/orig -> origin/gh/PaulZhang12/21/orig 2025-09-07T07:51:35.8348135Z * [new branch] gh/PaulZhang12/22/base -> origin/gh/PaulZhang12/22/base 2025-09-07T07:51:35.8349731Z * [new branch] gh/PaulZhang12/22/head -> origin/gh/PaulZhang12/22/head 2025-09-07T07:51:35.8351314Z * [new branch] gh/PaulZhang12/22/orig -> origin/gh/PaulZhang12/22/orig 2025-09-07T07:51:35.8353629Z * [new branch] gh/PaulZhang12/23/base -> origin/gh/PaulZhang12/23/base 2025-09-07T07:51:35.8355586Z * [new branch] gh/PaulZhang12/23/head -> origin/gh/PaulZhang12/23/head 2025-09-07T07:51:35.8357294Z * [new branch] gh/PaulZhang12/23/orig -> origin/gh/PaulZhang12/23/orig 2025-09-07T07:51:35.8359526Z * [new branch] gh/PaulZhang12/24/base -> origin/gh/PaulZhang12/24/base 2025-09-07T07:51:35.8361111Z * [new branch] gh/PaulZhang12/24/head -> origin/gh/PaulZhang12/24/head 2025-09-07T07:51:35.8362824Z * [new branch] gh/PaulZhang12/24/orig -> origin/gh/PaulZhang12/24/orig 2025-09-07T07:51:35.8365583Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-09-07T07:51:35.8367098Z * [new branch] gh/PaulZhang12/25/head -> origin/gh/PaulZhang12/25/head 2025-09-07T07:51:35.8368937Z * [new branch] gh/PaulZhang12/25/orig -> 
origin/gh/PaulZhang12/25/orig 2025-09-07T07:51:35.8371516Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-09-07T07:51:35.8373209Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-09-07T07:51:35.8376448Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-09-07T07:51:35.8378606Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-09-07T07:51:35.8380776Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-09-07T07:51:35.8383120Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-09-07T07:51:35.8386147Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-09-07T07:51:35.8387728Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-09-07T07:51:35.8390103Z * [new branch] gh/StrongerXi/133/base -> origin/gh/StrongerXi/133/base 2025-09-07T07:51:35.8391665Z * [new branch] gh/StrongerXi/133/head -> origin/gh/StrongerXi/133/head 2025-09-07T07:51:35.8393366Z * [new branch] gh/StrongerXi/133/orig -> origin/gh/StrongerXi/133/orig 2025-09-07T07:51:35.8395776Z * [new branch] gh/StrongerXi/134/base -> origin/gh/StrongerXi/134/base 2025-09-07T07:51:35.8397559Z * [new branch] gh/StrongerXi/134/head -> origin/gh/StrongerXi/134/head 2025-09-07T07:51:35.8398982Z * [new branch] gh/StrongerXi/134/orig -> origin/gh/StrongerXi/134/orig 2025-09-07T07:51:35.8401199Z * [new branch] gh/StrongerXi/136/base -> origin/gh/StrongerXi/136/base 2025-09-07T07:51:35.8402821Z * [new branch] gh/StrongerXi/136/head -> origin/gh/StrongerXi/136/head 2025-09-07T07:51:35.8404381Z * [new branch] gh/StrongerXi/136/orig -> origin/gh/StrongerXi/136/orig 2025-09-07T07:51:35.8406891Z * [new branch] gh/StrongerXi/137/base -> origin/gh/StrongerXi/137/base 2025-09-07T07:51:35.8408450Z * [new branch] gh/StrongerXi/137/head -> origin/gh/StrongerXi/137/head 2025-09-07T07:51:35.8410082Z * [new branch] 
gh/StrongerXi/137/orig -> origin/gh/StrongerXi/137/orig 2025-09-07T07:51:35.8412289Z * [new branch] gh/StrongerXi/138/base -> origin/gh/StrongerXi/138/base 2025-09-07T07:51:35.8413940Z * [new branch] gh/StrongerXi/138/head -> origin/gh/StrongerXi/138/head 2025-09-07T07:51:35.8416163Z * [new branch] gh/StrongerXi/138/orig -> origin/gh/StrongerXi/138/orig 2025-09-07T07:51:35.8418457Z * [new branch] gh/StrongerXi/139/base -> origin/gh/StrongerXi/139/base 2025-09-07T07:51:35.8420004Z * [new branch] gh/StrongerXi/139/head -> origin/gh/StrongerXi/139/head 2025-09-07T07:51:35.8421760Z * [new branch] gh/StrongerXi/139/orig -> origin/gh/StrongerXi/139/orig 2025-09-07T07:51:35.8424066Z * [new branch] gh/StrongerXi/140/base -> origin/gh/StrongerXi/140/base 2025-09-07T07:51:35.8425868Z * [new branch] gh/StrongerXi/140/head -> origin/gh/StrongerXi/140/head 2025-09-07T07:51:35.8427574Z * [new branch] gh/StrongerXi/140/orig -> origin/gh/StrongerXi/140/orig 2025-09-07T07:51:35.8429901Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-09-07T07:51:35.8431509Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-09-07T07:51:35.8433677Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-09-07T07:51:35.8435591Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-09-07T07:51:35.8438453Z * [new branch] gh/XilunWu/133/base -> origin/gh/XilunWu/133/base 2025-09-07T07:51:35.8439990Z * [new branch] gh/XilunWu/133/head -> origin/gh/XilunWu/133/head 2025-09-07T07:51:35.8441596Z * [new branch] gh/XilunWu/133/orig -> origin/gh/XilunWu/133/orig 2025-09-07T07:51:35.8443952Z * [new branch] gh/XilunWu/139/base -> origin/gh/XilunWu/139/base 2025-09-07T07:51:35.8445808Z * [new branch] gh/XilunWu/139/head -> origin/gh/XilunWu/139/head 2025-09-07T07:51:35.8447428Z * [new branch] gh/XilunWu/139/orig -> origin/gh/XilunWu/139/orig 2025-09-07T07:51:35.8449814Z * [new branch] gh/XilunWu/143/base -> 
origin/gh/XilunWu/143/base 2025-09-07T07:51:35.8451396Z * [new branch] gh/XilunWu/143/head -> origin/gh/XilunWu/143/head 2025-09-07T07:51:35.8452998Z * [new branch] gh/XilunWu/143/orig -> origin/gh/XilunWu/143/orig 2025-09-07T07:51:35.8455797Z * [new branch] gh/XilunWu/144/base -> origin/gh/XilunWu/144/base 2025-09-07T07:51:35.8457203Z * [new branch] gh/XilunWu/144/head -> origin/gh/XilunWu/144/head 2025-09-07T07:51:35.8458738Z * [new branch] gh/XilunWu/144/orig -> origin/gh/XilunWu/144/orig 2025-09-07T07:51:35.8461075Z * [new branch] gh/XilunWu/145/base -> origin/gh/XilunWu/145/base 2025-09-07T07:51:35.8462826Z * [new branch] gh/XilunWu/145/head -> origin/gh/XilunWu/145/head 2025-09-07T07:51:35.8464708Z * [new branch] gh/XilunWu/145/orig -> origin/gh/XilunWu/145/orig 2025-09-07T07:51:35.8467112Z * [new branch] gh/XilunWu/146/base -> origin/gh/XilunWu/146/base 2025-09-07T07:51:35.8468732Z * [new branch] gh/XilunWu/146/head -> origin/gh/XilunWu/146/head 2025-09-07T07:51:35.8470269Z * [new branch] gh/XilunWu/146/orig -> origin/gh/XilunWu/146/orig 2025-09-07T07:51:35.8472591Z * [new branch] gh/XilunWu/147/base -> origin/gh/XilunWu/147/base 2025-09-07T07:51:35.8474196Z * [new branch] gh/XilunWu/147/head -> origin/gh/XilunWu/147/head 2025-09-07T07:51:35.8476089Z * [new branch] gh/XilunWu/147/orig -> origin/gh/XilunWu/147/orig 2025-09-07T07:51:35.8478178Z * [new branch] gh/XilunWu/148/base -> origin/gh/XilunWu/148/base 2025-09-07T07:51:35.8479825Z * [new branch] gh/XilunWu/148/head -> origin/gh/XilunWu/148/head 2025-09-07T07:51:35.8481349Z * [new branch] gh/XilunWu/148/orig -> origin/gh/XilunWu/148/orig 2025-09-07T07:51:35.8483521Z * [new branch] gh/XilunWu/149/base -> origin/gh/XilunWu/149/base 2025-09-07T07:51:35.8485275Z * [new branch] gh/XilunWu/149/head -> origin/gh/XilunWu/149/head 2025-09-07T07:51:35.8487066Z * [new branch] gh/XilunWu/149/orig -> origin/gh/XilunWu/149/orig 2025-09-07T07:51:35.8489217Z * [new branch] gh/XilunWu/150/base -> 
origin/gh/XilunWu/150/base 2025-09-07T07:51:35.8490747Z * [new branch] gh/XilunWu/150/head -> origin/gh/XilunWu/150/head 2025-09-07T07:51:35.8492332Z * [new branch] gh/XilunWu/150/orig -> origin/gh/XilunWu/150/orig 2025-09-07T07:51:35.8494710Z * [new branch] gh/XilunWu/151/base -> origin/gh/XilunWu/151/base 2025-09-07T07:51:35.8496769Z * [new branch] gh/XilunWu/151/head -> origin/gh/XilunWu/151/head 2025-09-07T07:51:35.8498361Z * [new branch] gh/XilunWu/151/orig -> origin/gh/XilunWu/151/orig 2025-09-07T07:51:35.8500571Z * [new branch] gh/XilunWu/152/base -> origin/gh/XilunWu/152/base 2025-09-07T07:51:35.8502278Z * [new branch] gh/XilunWu/152/head -> origin/gh/XilunWu/152/head 2025-09-07T07:51:35.8503771Z * [new branch] gh/XilunWu/152/orig -> origin/gh/XilunWu/152/orig 2025-09-07T07:51:35.8506599Z * [new branch] gh/XilunWu/153/base -> origin/gh/XilunWu/153/base 2025-09-07T07:51:35.8508252Z * [new branch] gh/XilunWu/153/head -> origin/gh/XilunWu/153/head 2025-09-07T07:51:35.8509803Z * [new branch] gh/XilunWu/153/orig -> origin/gh/XilunWu/153/orig 2025-09-07T07:51:35.8512290Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-09-07T07:51:35.8513754Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-09-07T07:51:35.8515538Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-09-07T07:51:35.8518010Z * [new branch] gh/XilunWu/161/base -> origin/gh/XilunWu/161/base 2025-09-07T07:51:35.8519502Z * [new branch] gh/XilunWu/161/head -> origin/gh/XilunWu/161/head 2025-09-07T07:51:35.8521113Z * [new branch] gh/XilunWu/161/orig -> origin/gh/XilunWu/161/orig 2025-09-07T07:51:35.8523401Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-09-07T07:51:35.8525235Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-09-07T07:51:35.8526985Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-09-07T07:51:35.8529339Z * [new branch] gh/XilunWu/164/base -> 
origin/gh/XilunWu/164/base 2025-09-07T07:51:35.8531177Z * [new branch] gh/XilunWu/164/head -> origin/gh/XilunWu/164/head 2025-09-07T07:51:35.8532608Z * [new branch] gh/XilunWu/164/orig -> origin/gh/XilunWu/164/orig 2025-09-07T07:51:35.8535107Z * [new branch] gh/XilunWu/165/base -> origin/gh/XilunWu/165/base 2025-09-07T07:51:35.8536987Z * [new branch] gh/XilunWu/165/head -> origin/gh/XilunWu/165/head 2025-09-07T07:51:35.8538578Z * [new branch] gh/XilunWu/165/orig -> origin/gh/XilunWu/165/orig 2025-09-07T07:51:35.8540925Z * [new branch] gh/XilunWu/166/base -> origin/gh/XilunWu/166/base 2025-09-07T07:51:35.8542712Z * [new branch] gh/XilunWu/166/head -> origin/gh/XilunWu/166/head 2025-09-07T07:51:35.8544255Z * [new branch] gh/XilunWu/166/orig -> origin/gh/XilunWu/166/orig 2025-09-07T07:51:35.8547119Z * [new branch] gh/XilunWu/167/base -> origin/gh/XilunWu/167/base 2025-09-07T07:51:35.8548539Z * [new branch] gh/XilunWu/167/head -> origin/gh/XilunWu/167/head 2025-09-07T07:51:35.8550135Z * [new branch] gh/XilunWu/167/orig -> origin/gh/XilunWu/167/orig 2025-09-07T07:51:35.8552468Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-09-07T07:51:35.8554015Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-09-07T07:51:35.8555824Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-09-07T07:51:35.8558098Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-09-07T07:51:35.8559939Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-09-07T07:51:35.8561531Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-09-07T07:51:35.8563692Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-09-07T07:51:35.8565546Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-09-07T07:51:35.8567180Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-09-07T07:51:35.8570068Z * [new branch] gh/XuehaiPan/14/base -> 
origin/gh/XuehaiPan/14/base 2025-09-07T07:51:35.8571792Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-09-07T07:51:35.8573250Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-09-07T07:51:35.8575898Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-09-07T07:51:35.8577465Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-09-07T07:51:35.8579086Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-09-07T07:51:35.8581628Z * [new branch] gh/XuehaiPan/189/base -> origin/gh/XuehaiPan/189/base 2025-09-07T07:51:35.8583374Z * [new branch] gh/XuehaiPan/189/head -> origin/gh/XuehaiPan/189/head 2025-09-07T07:51:35.8584881Z * [new branch] gh/XuehaiPan/189/orig -> origin/gh/XuehaiPan/189/orig 2025-09-07T07:51:35.8587496Z * [new branch] gh/XuehaiPan/232/base -> origin/gh/XuehaiPan/232/base 2025-09-07T07:51:35.8589130Z * [new branch] gh/XuehaiPan/232/head -> origin/gh/XuehaiPan/232/head 2025-09-07T07:51:35.8590677Z * [new branch] gh/XuehaiPan/232/orig -> origin/gh/XuehaiPan/232/orig 2025-09-07T07:51:35.8592996Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-09-07T07:51:35.8594679Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-09-07T07:51:35.8596513Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-09-07T07:51:35.8598855Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-09-07T07:51:35.8600289Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-09-07T07:51:35.8601922Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-09-07T07:51:35.8604095Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-09-07T07:51:35.8606029Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-09-07T07:51:35.8607649Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 
2025-09-07T07:51:35.8609850Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-09-07T07:51:35.8611523Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-09-07T07:51:35.8613001Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-09-07T07:51:35.8615488Z * [new branch] gh/XuehaiPan/257/base -> origin/gh/XuehaiPan/257/base 2025-09-07T07:51:35.8617170Z * [new branch] gh/XuehaiPan/257/head -> origin/gh/XuehaiPan/257/head 2025-09-07T07:51:35.8618661Z * [new branch] gh/XuehaiPan/257/orig -> origin/gh/XuehaiPan/257/orig 2025-09-07T07:51:35.8621177Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-09-07T07:51:35.8622708Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-09-07T07:51:35.8624242Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-09-07T07:51:35.8626819Z * [new branch] gh/XuehaiPan/290/base -> origin/gh/XuehaiPan/290/base 2025-09-07T07:51:35.8628652Z * [new branch] gh/XuehaiPan/290/head -> origin/gh/XuehaiPan/290/head 2025-09-07T07:51:35.8630079Z * [new branch] gh/XuehaiPan/290/orig -> origin/gh/XuehaiPan/290/orig 2025-09-07T07:51:35.8632232Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-09-07T07:51:35.8633948Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-09-07T07:51:35.8635668Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-09-07T07:51:35.8638088Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-09-07T07:51:35.8639757Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-09-07T07:51:35.8641336Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-09-07T07:51:35.8643466Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-09-07T07:51:35.8645189Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-09-07T07:51:35.8646997Z * [new 
branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-09-07T07:51:35.8649223Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-09-07T07:51:35.8650894Z * [new branch] gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-09-07T07:51:35.8652520Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-09-07T07:51:35.8655179Z * [new branch] gh/XuehaiPan/356/base -> origin/gh/XuehaiPan/356/base 2025-09-07T07:51:35.8657004Z * [new branch] gh/XuehaiPan/356/head -> origin/gh/XuehaiPan/356/head 2025-09-07T07:51:35.8658564Z * [new branch] gh/XuehaiPan/356/orig -> origin/gh/XuehaiPan/356/orig 2025-09-07T07:51:35.8660917Z * [new branch] gh/XuehaiPan/357/base -> origin/gh/XuehaiPan/357/base 2025-09-07T07:51:35.8662690Z * [new branch] gh/XuehaiPan/357/head -> origin/gh/XuehaiPan/357/head 2025-09-07T07:51:35.8664365Z * [new branch] gh/XuehaiPan/357/orig -> origin/gh/XuehaiPan/357/orig 2025-09-07T07:51:35.8666884Z * [new branch] gh/XuehaiPan/358/base -> origin/gh/XuehaiPan/358/base 2025-09-07T07:51:35.8668454Z * [new branch] gh/XuehaiPan/358/head -> origin/gh/XuehaiPan/358/head 2025-09-07T07:51:35.8670061Z * [new branch] gh/XuehaiPan/358/orig -> origin/gh/XuehaiPan/358/orig 2025-09-07T07:51:35.8672220Z * [new branch] gh/XuehaiPan/359/base -> origin/gh/XuehaiPan/359/base 2025-09-07T07:51:35.8673800Z * [new branch] gh/XuehaiPan/359/head -> origin/gh/XuehaiPan/359/head 2025-09-07T07:51:35.8675567Z * [new branch] gh/XuehaiPan/359/orig -> origin/gh/XuehaiPan/359/orig 2025-09-07T07:51:35.8677965Z * [new branch] gh/XuehaiPan/360/base -> origin/gh/XuehaiPan/360/base 2025-09-07T07:51:35.8679539Z * [new branch] gh/XuehaiPan/360/head -> origin/gh/XuehaiPan/360/head 2025-09-07T07:51:35.8681178Z * [new branch] gh/XuehaiPan/360/orig -> origin/gh/XuehaiPan/360/orig 2025-09-07T07:51:35.8683464Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-09-07T07:51:35.8685175Z * [new branch] gh/XuehaiPan/365/head -> 
origin/gh/XuehaiPan/365/head 2025-09-07T07:51:35.8687022Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-09-07T07:51:35.8689280Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-09-07T07:51:35.8690838Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-09-07T07:51:35.8693139Z * [new branch] gh/XuehaiPan/369/base -> origin/gh/XuehaiPan/369/base 2025-09-07T07:51:35.8694728Z * [new branch] gh/XuehaiPan/369/head -> origin/gh/XuehaiPan/369/head 2025-09-07T07:51:35.8696572Z * [new branch] gh/XuehaiPan/369/orig -> origin/gh/XuehaiPan/369/orig 2025-09-07T07:51:35.8698897Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-09-07T07:51:35.8700484Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-09-07T07:51:35.8702190Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-09-07T07:51:35.8704469Z * [new branch] gh/XuehaiPan/380/base -> origin/gh/XuehaiPan/380/base 2025-09-07T07:51:35.8706340Z * [new branch] gh/XuehaiPan/380/head -> origin/gh/XuehaiPan/380/head 2025-09-07T07:51:35.8707838Z * [new branch] gh/XuehaiPan/380/orig -> origin/gh/XuehaiPan/380/orig 2025-09-07T07:51:35.8710142Z * [new branch] gh/XuehaiPan/381/base -> origin/gh/XuehaiPan/381/base 2025-09-07T07:51:35.8711729Z * [new branch] gh/XuehaiPan/381/head -> origin/gh/XuehaiPan/381/head 2025-09-07T07:51:35.8714067Z * [new branch] gh/XuehaiPan/382/base -> origin/gh/XuehaiPan/382/base 2025-09-07T07:51:35.8716074Z * [new branch] gh/XuehaiPan/382/head -> origin/gh/XuehaiPan/382/head 2025-09-07T07:51:35.8717677Z * [new branch] gh/XuehaiPan/382/orig -> origin/gh/XuehaiPan/382/orig 2025-09-07T07:51:35.8719988Z * [new branch] gh/XuehaiPan/383/base -> origin/gh/XuehaiPan/383/base 2025-09-07T07:51:35.8721536Z * [new branch] gh/XuehaiPan/383/head -> origin/gh/XuehaiPan/383/head 2025-09-07T07:51:35.8723084Z * [new branch] gh/XuehaiPan/383/orig -> origin/gh/XuehaiPan/383/orig 
2025-09-07T07:51:35.8725507Z * [new branch] gh/XuehaiPan/384/base -> origin/gh/XuehaiPan/384/base 2025-09-07T07:51:35.8727185Z * [new branch] gh/XuehaiPan/384/head -> origin/gh/XuehaiPan/384/head 2025-09-07T07:51:35.8728803Z * [new branch] gh/XuehaiPan/384/orig -> origin/gh/XuehaiPan/384/orig 2025-09-07T07:51:35.8731288Z * [new branch] gh/XuehaiPan/385/base -> origin/gh/XuehaiPan/385/base 2025-09-07T07:51:35.8732777Z * [new branch] gh/XuehaiPan/385/head -> origin/gh/XuehaiPan/385/head 2025-09-07T07:51:35.8734249Z * [new branch] gh/XuehaiPan/385/orig -> origin/gh/XuehaiPan/385/orig 2025-09-07T07:51:35.8736808Z * [new branch] gh/XuehaiPan/386/base -> origin/gh/XuehaiPan/386/base 2025-09-07T07:51:35.8738439Z * [new branch] gh/XuehaiPan/386/head -> origin/gh/XuehaiPan/386/head 2025-09-07T07:51:35.8739956Z * [new branch] gh/XuehaiPan/386/orig -> origin/gh/XuehaiPan/386/orig 2025-09-07T07:51:35.8742298Z * [new branch] gh/XuehaiPan/387/base -> origin/gh/XuehaiPan/387/base 2025-09-07T07:51:35.8743864Z * [new branch] gh/XuehaiPan/387/head -> origin/gh/XuehaiPan/387/head 2025-09-07T07:51:35.8745547Z * [new branch] gh/XuehaiPan/387/orig -> origin/gh/XuehaiPan/387/orig 2025-09-07T07:51:35.8748445Z * [new branch] gh/ZainRizvi/1/base -> origin/gh/ZainRizvi/1/base 2025-09-07T07:51:35.8750227Z * [new branch] gh/ZainRizvi/1/head -> origin/gh/ZainRizvi/1/head 2025-09-07T07:51:35.8752408Z * [new branch] gh/ZainRizvi/2/base -> origin/gh/ZainRizvi/2/base 2025-09-07T07:51:35.8754003Z * [new branch] gh/ZainRizvi/2/head -> origin/gh/ZainRizvi/2/head 2025-09-07T07:51:35.8756664Z * [new branch] gh/ZainRizvi/3/base -> origin/gh/ZainRizvi/3/base 2025-09-07T07:51:35.8758147Z * [new branch] gh/ZainRizvi/3/head -> origin/gh/ZainRizvi/3/head 2025-09-07T07:51:35.8760413Z * [new branch] gh/ZainRizvi/4/base -> origin/gh/ZainRizvi/4/base 2025-09-07T07:51:35.8762051Z * [new branch] gh/ZainRizvi/4/head -> origin/gh/ZainRizvi/4/head 2025-09-07T07:51:35.8764113Z * [new branch] gh/ZainRizvi/5/base -> 
origin/gh/ZainRizvi/5/base 2025-09-07T07:51:35.8765821Z * [new branch] gh/ZainRizvi/5/head -> origin/gh/ZainRizvi/5/head 2025-09-07T07:51:35.8768107Z * [new branch] gh/ZainRizvi/6/base -> origin/gh/ZainRizvi/6/base 2025-09-07T07:51:35.8769690Z * [new branch] gh/ZainRizvi/6/head -> origin/gh/ZainRizvi/6/head 2025-09-07T07:51:35.8771352Z * [new branch] gh/ZainRizvi/6/orig -> origin/gh/ZainRizvi/6/orig 2025-09-07T07:51:35.8773625Z * [new branch] gh/ZainRizvi/7/base -> origin/gh/ZainRizvi/7/base 2025-09-07T07:51:35.8775310Z * [new branch] gh/ZainRizvi/7/head -> origin/gh/ZainRizvi/7/head 2025-09-07T07:51:35.8776968Z * [new branch] gh/ZainRizvi/7/orig -> origin/gh/ZainRizvi/7/orig 2025-09-07T07:51:35.8779269Z * [new branch] gh/ZainRizvi/8/base -> origin/gh/ZainRizvi/8/base 2025-09-07T07:51:35.8780928Z * [new branch] gh/ZainRizvi/8/head -> origin/gh/ZainRizvi/8/head 2025-09-07T07:51:35.8783228Z * [new branch] gh/ZainRizvi/9/base -> origin/gh/ZainRizvi/9/base 2025-09-07T07:51:35.8784718Z * [new branch] gh/ZainRizvi/9/head -> origin/gh/ZainRizvi/9/head 2025-09-07T07:51:35.8786580Z * [new branch] gh/ZainRizvi/9/orig -> origin/gh/ZainRizvi/9/orig 2025-09-07T07:51:35.8789364Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-09-07T07:51:35.8791212Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-09-07T07:51:35.8792712Z * [new branch] gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-09-07T07:51:35.8795148Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-09-07T07:51:35.8796965Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-09-07T07:51:35.8799256Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-09-07T07:51:35.8800656Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-09-07T07:51:35.8803011Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-09-07T07:51:35.8804639Z 
* [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-09-07T07:51:35.8807225Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-09-07T07:51:35.8808761Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-09-07T07:51:35.8811012Z * [new branch] gh/ZhiweiYan-96/64/base -> origin/gh/ZhiweiYan-96/64/base 2025-09-07T07:51:35.8812561Z * [new branch] gh/ZhiweiYan-96/64/head -> origin/gh/ZhiweiYan-96/64/head 2025-09-07T07:51:35.8814215Z * [new branch] gh/ZhiweiYan-96/64/orig -> origin/gh/ZhiweiYan-96/64/orig 2025-09-07T07:51:35.8816685Z * [new branch] gh/ZhiweiYan-96/65/base -> origin/gh/ZhiweiYan-96/65/base 2025-09-07T07:51:35.8818264Z * [new branch] gh/ZhiweiYan-96/65/head -> origin/gh/ZhiweiYan-96/65/head 2025-09-07T07:51:35.8819862Z * [new branch] gh/ZhiweiYan-96/65/orig -> origin/gh/ZhiweiYan-96/65/orig 2025-09-07T07:51:35.8822236Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-09-07T07:51:35.8823717Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-09-07T07:51:35.8826367Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-09-07T07:51:35.8827936Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-09-07T07:51:35.8830133Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-09-07T07:51:35.8831732Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-09-07T07:51:35.8833201Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-09-07T07:51:35.8836282Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-09-07T07:51:35.8837913Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-09-07T07:51:35.8840024Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-09-07T07:51:35.8841707Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 
2025-09-07T07:51:35.8844063Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-09-07T07:51:35.8845875Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-09-07T07:51:35.8847657Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-09-07T07:51:35.8850035Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-09-07T07:51:35.8852771Z * [new branch] gh/alexsamardzic/9/base -> origin/gh/alexsamardzic/9/base 2025-09-07T07:51:35.8854218Z * [new branch] gh/alexsamardzic/9/head -> origin/gh/alexsamardzic/9/head 2025-09-07T07:51:35.8856160Z * [new branch] gh/alexsamardzic/9/orig -> origin/gh/alexsamardzic/9/orig 2025-09-07T07:51:35.8858913Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-09-07T07:51:35.8860534Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-09-07T07:51:35.8862167Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-09-07T07:51:35.8865337Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-09-07T07:51:35.8867830Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-09-07T07:51:35.8868766Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-09-07T07:51:35.8871217Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-09-07T07:51:35.8872887Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-09-07T07:51:35.8874478Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-09-07T07:51:35.8877114Z * [new branch] gh/andrewor14/51/base -> origin/gh/andrewor14/51/base 2025-09-07T07:51:35.8878801Z * [new branch] gh/andrewor14/51/orig -> origin/gh/andrewor14/51/orig 2025-09-07T07:51:35.8881640Z * [new branch] gh/andyanwang/1/base -> origin/gh/andyanwang/1/base 2025-09-07T07:51:35.8883185Z * [new branch] gh/andyanwang/1/head -> origin/gh/andyanwang/1/head 
2025-09-07T07:51:35.8884800Z * [new branch] gh/andyanwang/1/orig -> origin/gh/andyanwang/1/orig 2025-09-07T07:51:35.8887471Z * [new branch] gh/andyanwang/13/base -> origin/gh/andyanwang/13/base 2025-09-07T07:51:35.8889102Z * [new branch] gh/andyanwang/13/head -> origin/gh/andyanwang/13/head 2025-09-07T07:51:35.8891146Z * [new branch] gh/andyanwang/13/orig -> origin/gh/andyanwang/13/orig 2025-09-07T07:51:35.8893460Z * [new branch] gh/andyanwang/2/base -> origin/gh/andyanwang/2/base 2025-09-07T07:51:35.8895136Z * [new branch] gh/andyanwang/2/head -> origin/gh/andyanwang/2/head 2025-09-07T07:51:35.8896892Z * [new branch] gh/andyanwang/2/orig -> origin/gh/andyanwang/2/orig 2025-09-07T07:51:35.8899211Z * [new branch] gh/andyanwang/28/base -> origin/gh/andyanwang/28/base 2025-09-07T07:51:35.8900792Z * [new branch] gh/andyanwang/28/head -> origin/gh/andyanwang/28/head 2025-09-07T07:51:35.8902574Z * [new branch] gh/andyanwang/28/orig -> origin/gh/andyanwang/28/orig 2025-09-07T07:51:35.8904731Z * [new branch] gh/andyanwang/3/base -> origin/gh/andyanwang/3/base 2025-09-07T07:51:35.8906576Z * [new branch] gh/andyanwang/3/head -> origin/gh/andyanwang/3/head 2025-09-07T07:51:35.8908202Z * [new branch] gh/andyanwang/3/orig -> origin/gh/andyanwang/3/orig 2025-09-07T07:51:35.8910451Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-09-07T07:51:35.8912197Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-09-07T07:51:35.8914419Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-09-07T07:51:35.8916484Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-09-07T07:51:35.8918985Z * [new branch] gh/andyanwang/32/base -> origin/gh/andyanwang/32/base 2025-09-07T07:51:35.8920665Z * [new branch] gh/andyanwang/32/head -> origin/gh/andyanwang/32/head 2025-09-07T07:51:35.8922375Z * [new branch] gh/andyanwang/32/orig -> origin/gh/andyanwang/32/orig 2025-09-07T07:51:35.8924678Z * [new branch] 
gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-09-07T07:51:35.8926779Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-09-07T07:51:35.8928387Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-09-07T07:51:35.8930702Z * [new branch] gh/andyanwang/4/base -> origin/gh/andyanwang/4/base 2025-09-07T07:51:35.8932256Z * [new branch] gh/andyanwang/4/head -> origin/gh/andyanwang/4/head 2025-09-07T07:51:35.8934012Z * [new branch] gh/andyanwang/4/orig -> origin/gh/andyanwang/4/orig 2025-09-07T07:51:35.8937309Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-09-07T07:51:35.8938742Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-09-07T07:51:35.8940956Z * [new branch] gh/angelayi/111/base -> origin/gh/angelayi/111/base 2025-09-07T07:51:35.8942724Z * [new branch] gh/angelayi/111/head -> origin/gh/angelayi/111/head 2025-09-07T07:51:35.8944246Z * [new branch] gh/angelayi/111/orig -> origin/gh/angelayi/111/orig 2025-09-07T07:51:35.8946801Z * [new branch] gh/angelayi/112/base -> origin/gh/angelayi/112/base 2025-09-07T07:51:35.8948701Z * [new branch] gh/angelayi/112/head -> origin/gh/angelayi/112/head 2025-09-07T07:51:35.8950381Z * [new branch] gh/angelayi/112/orig -> origin/gh/angelayi/112/orig 2025-09-07T07:51:35.8952694Z * [new branch] gh/angelayi/113/base -> origin/gh/angelayi/113/base 2025-09-07T07:51:35.8954230Z * [new branch] gh/angelayi/113/head -> origin/gh/angelayi/113/head 2025-09-07T07:51:35.8956172Z * [new branch] gh/angelayi/113/orig -> origin/gh/angelayi/113/orig 2025-09-07T07:51:35.8958438Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-09-07T07:51:35.8960022Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-09-07T07:51:35.8962122Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-09-07T07:51:35.8963927Z * [new branch] gh/angelayi/115/base -> origin/gh/angelayi/115/base 
2025-09-07T07:51:35.8965683Z * [new branch] gh/angelayi/115/head -> origin/gh/angelayi/115/head 2025-09-07T07:51:35.8967287Z * [new branch] gh/angelayi/115/orig -> origin/gh/angelayi/115/orig 2025-09-07T07:51:35.8970194Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-09-07T07:51:35.8971902Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-09-07T07:51:35.8973427Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-09-07T07:51:35.8976052Z * [new branch] gh/anijain2305/766/base -> origin/gh/anijain2305/766/base 2025-09-07T07:51:35.8977669Z * [new branch] gh/anijain2305/766/head -> origin/gh/anijain2305/766/head 2025-09-07T07:51:35.8979134Z * [new branch] gh/anijain2305/766/orig -> origin/gh/anijain2305/766/orig 2025-09-07T07:51:35.8981528Z * [new branch] gh/anijain2305/790/base -> origin/gh/anijain2305/790/base 2025-09-07T07:51:35.8983238Z * [new branch] gh/anijain2305/790/head -> origin/gh/anijain2305/790/head 2025-09-07T07:51:35.8984747Z * [new branch] gh/anijain2305/790/orig -> origin/gh/anijain2305/790/orig 2025-09-07T07:51:35.9003148Z * [new branch] gh/anijain2305/792/base -> origin/gh/anijain2305/792/base 2025-09-07T07:51:35.9003724Z * [new branch] gh/anijain2305/792/head -> origin/gh/anijain2305/792/head 2025-09-07T07:51:35.9004194Z * [new branch] gh/anijain2305/792/orig -> origin/gh/anijain2305/792/orig 2025-09-07T07:51:35.9004647Z * [new branch] gh/anijain2305/803/base -> origin/gh/anijain2305/803/base 2025-09-07T07:51:35.9005232Z * [new branch] gh/anijain2305/803/head -> origin/gh/anijain2305/803/head 2025-09-07T07:51:35.9005684Z * [new branch] gh/anijain2305/803/orig -> origin/gh/anijain2305/803/orig 2025-09-07T07:51:35.9006143Z * [new branch] gh/anijain2305/804/base -> origin/gh/anijain2305/804/base 2025-09-07T07:51:35.9006598Z * [new branch] gh/anijain2305/804/head -> origin/gh/anijain2305/804/head 2025-09-07T07:51:35.9007293Z * [new branch] gh/anijain2305/804/orig -> 
origin/gh/anijain2305/804/orig 2025-09-07T07:51:35.9007960Z * [new branch] gh/anijain2305/805/base -> origin/gh/anijain2305/805/base 2025-09-07T07:51:35.9008415Z * [new branch] gh/anijain2305/805/head -> origin/gh/anijain2305/805/head 2025-09-07T07:51:35.9008843Z * [new branch] gh/anijain2305/805/orig -> origin/gh/anijain2305/805/orig 2025-09-07T07:51:35.9010210Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-09-07T07:51:35.9011737Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-09-07T07:51:35.9013308Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-09-07T07:51:35.9015989Z * [new branch] gh/anijain2305/812/base -> origin/gh/anijain2305/812/base 2025-09-07T07:51:35.9017682Z * [new branch] gh/anijain2305/812/head -> origin/gh/anijain2305/812/head 2025-09-07T07:51:35.9019421Z * [new branch] gh/anijain2305/812/orig -> origin/gh/anijain2305/812/orig 2025-09-07T07:51:35.9021736Z * [new branch] gh/anijain2305/838/base -> origin/gh/anijain2305/838/base 2025-09-07T07:51:35.9023419Z * [new branch] gh/anijain2305/838/head -> origin/gh/anijain2305/838/head 2025-09-07T07:51:35.9025106Z * [new branch] gh/anijain2305/838/orig -> origin/gh/anijain2305/838/orig 2025-09-07T07:51:35.9027523Z * [new branch] gh/anijain2305/839/base -> origin/gh/anijain2305/839/base 2025-09-07T07:51:35.9029096Z * [new branch] gh/anijain2305/839/head -> origin/gh/anijain2305/839/head 2025-09-07T07:51:35.9030706Z * [new branch] gh/anijain2305/839/orig -> origin/gh/anijain2305/839/orig 2025-09-07T07:51:35.9032921Z * [new branch] gh/anijain2305/843/base -> origin/gh/anijain2305/843/base 2025-09-07T07:51:35.9034505Z * [new branch] gh/anijain2305/843/head -> origin/gh/anijain2305/843/head 2025-09-07T07:51:35.9036644Z * [new branch] gh/anijain2305/843/orig -> origin/gh/anijain2305/843/orig 2025-09-07T07:51:35.9038620Z * [new branch] gh/anijain2305/844/base -> origin/gh/anijain2305/844/base 2025-09-07T07:51:35.9040234Z * 
[new branch] gh/anijain2305/844/head -> origin/gh/anijain2305/844/head 2025-09-07T07:51:35.9041709Z * [new branch] gh/anijain2305/844/orig -> origin/gh/anijain2305/844/orig 2025-09-07T07:51:35.9044178Z * [new branch] gh/anijain2305/846/base -> origin/gh/anijain2305/846/base 2025-09-07T07:51:35.9046151Z * [new branch] gh/anijain2305/846/head -> origin/gh/anijain2305/846/head 2025-09-07T07:51:35.9047729Z * [new branch] gh/anijain2305/846/orig -> origin/gh/anijain2305/846/orig 2025-09-07T07:51:35.9050183Z * [new branch] gh/anijain2305/848/base -> origin/gh/anijain2305/848/base 2025-09-07T07:51:35.9051823Z * [new branch] gh/anijain2305/848/head -> origin/gh/anijain2305/848/head 2025-09-07T07:51:35.9053435Z * [new branch] gh/anijain2305/848/orig -> origin/gh/anijain2305/848/orig 2025-09-07T07:51:35.9055995Z * [new branch] gh/anijain2305/849/base -> origin/gh/anijain2305/849/base 2025-09-07T07:51:35.9057535Z * [new branch] gh/anijain2305/849/head -> origin/gh/anijain2305/849/head 2025-09-07T07:51:35.9059379Z * [new branch] gh/anijain2305/849/orig -> origin/gh/anijain2305/849/orig 2025-09-07T07:51:35.9061576Z * [new branch] gh/anijain2305/850/base -> origin/gh/anijain2305/850/base 2025-09-07T07:51:35.9063186Z * [new branch] gh/anijain2305/850/head -> origin/gh/anijain2305/850/head 2025-09-07T07:51:35.9064703Z * [new branch] gh/anijain2305/850/orig -> origin/gh/anijain2305/850/orig 2025-09-07T07:51:35.9067279Z * [new branch] gh/anijain2305/851/base -> origin/gh/anijain2305/851/base 2025-09-07T07:51:35.9069112Z * [new branch] gh/anijain2305/851/head -> origin/gh/anijain2305/851/head 2025-09-07T07:51:35.9070499Z * [new branch] gh/anijain2305/851/orig -> origin/gh/anijain2305/851/orig 2025-09-07T07:51:35.9072857Z * [new branch] gh/anijain2305/852/base -> origin/gh/anijain2305/852/base 2025-09-07T07:51:35.9074403Z * [new branch] gh/anijain2305/852/head -> origin/gh/anijain2305/852/head 2025-09-07T07:51:35.9076366Z * [new branch] gh/anijain2305/852/orig -> 
origin/gh/anijain2305/852/orig 2025-09-07T07:51:35.9078591Z * [new branch] gh/anijain2305/853/base -> origin/gh/anijain2305/853/base 2025-09-07T07:51:35.9080124Z * [new branch] gh/anijain2305/853/head -> origin/gh/anijain2305/853/head 2025-09-07T07:51:35.9081730Z * [new branch] gh/anijain2305/853/orig -> origin/gh/anijain2305/853/orig 2025-09-07T07:51:35.9083944Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 2025-09-07T07:51:35.9085847Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-09-07T07:51:35.9087477Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-09-07T07:51:35.9089797Z * [new branch] gh/anijain2305/855/base -> origin/gh/anijain2305/855/base 2025-09-07T07:51:35.9091468Z * [new branch] gh/anijain2305/855/head -> origin/gh/anijain2305/855/head 2025-09-07T07:51:35.9093029Z * [new branch] gh/anijain2305/855/orig -> origin/gh/anijain2305/855/orig 2025-09-07T07:51:35.9095414Z * [new branch] gh/anijain2305/856/base -> origin/gh/anijain2305/856/base 2025-09-07T07:51:35.9097256Z * [new branch] gh/anijain2305/856/head -> origin/gh/anijain2305/856/head 2025-09-07T07:51:35.9098738Z * [new branch] gh/anijain2305/856/orig -> origin/gh/anijain2305/856/orig 2025-09-07T07:51:35.9101063Z * [new branch] gh/anijain2305/857/base -> origin/gh/anijain2305/857/base 2025-09-07T07:51:35.9102903Z * [new branch] gh/anijain2305/857/head -> origin/gh/anijain2305/857/head 2025-09-07T07:51:35.9104248Z * [new branch] gh/anijain2305/857/orig -> origin/gh/anijain2305/857/orig 2025-09-07T07:51:35.9107149Z * [new branch] gh/anijain2305/858/base -> origin/gh/anijain2305/858/base 2025-09-07T07:51:35.9108753Z * [new branch] gh/anijain2305/858/head -> origin/gh/anijain2305/858/head 2025-09-07T07:51:35.9110296Z * [new branch] gh/anijain2305/858/orig -> origin/gh/anijain2305/858/orig 2025-09-07T07:51:35.9112744Z * [new branch] gh/anijain2305/859/base -> origin/gh/anijain2305/859/base 2025-09-07T07:51:35.9114265Z * 
[new branch] gh/anijain2305/859/head -> origin/gh/anijain2305/859/head 2025-09-07T07:51:35.9116123Z * [new branch] gh/anijain2305/859/orig -> origin/gh/anijain2305/859/orig 2025-09-07T07:51:35.9118346Z * [new branch] gh/anijain2305/860/base -> origin/gh/anijain2305/860/base 2025-09-07T07:51:35.9119999Z * [new branch] gh/anijain2305/860/head -> origin/gh/anijain2305/860/head 2025-09-07T07:51:35.9121471Z * [new branch] gh/anijain2305/860/orig -> origin/gh/anijain2305/860/orig 2025-09-07T07:51:35.9123850Z * [new branch] gh/anijain2305/861/base -> origin/gh/anijain2305/861/base 2025-09-07T07:51:35.9125764Z * [new branch] gh/anijain2305/861/head -> origin/gh/anijain2305/861/head 2025-09-07T07:51:35.9127391Z * [new branch] gh/anijain2305/861/orig -> origin/gh/anijain2305/861/orig 2025-09-07T07:51:35.9129666Z * [new branch] gh/anijain2305/862/base -> origin/gh/anijain2305/862/base 2025-09-07T07:51:35.9131336Z * [new branch] gh/anijain2305/862/head -> origin/gh/anijain2305/862/head 2025-09-07T07:51:35.9133109Z * [new branch] gh/anijain2305/862/orig -> origin/gh/anijain2305/862/orig 2025-09-07T07:51:35.9135419Z * [new branch] gh/anijain2305/863/base -> origin/gh/anijain2305/863/base 2025-09-07T07:51:35.9137143Z * [new branch] gh/anijain2305/863/head -> origin/gh/anijain2305/863/head 2025-09-07T07:51:35.9138892Z * [new branch] gh/anijain2305/863/orig -> origin/gh/anijain2305/863/orig 2025-09-07T07:51:35.9141227Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 2025-09-07T07:51:35.9142950Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-09-07T07:51:35.9144521Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-09-07T07:51:35.9147260Z * [new branch] gh/anijain2305/865/base -> origin/gh/anijain2305/865/base 2025-09-07T07:51:35.9148886Z * [new branch] gh/anijain2305/865/head -> origin/gh/anijain2305/865/head 2025-09-07T07:51:35.9150412Z * [new branch] gh/anijain2305/865/orig -> 
origin/gh/anijain2305/865/orig 2025-09-07T07:51:35.9152677Z * [new branch] gh/anijain2305/866/base -> origin/gh/anijain2305/866/base 2025-09-07T07:51:35.9154274Z * [new branch] gh/anijain2305/866/head -> origin/gh/anijain2305/866/head 2025-09-07T07:51:35.9156186Z * [new branch] gh/anijain2305/866/orig -> origin/gh/anijain2305/866/orig 2025-09-07T07:51:35.9159014Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-09-07T07:51:35.9160621Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-09-07T07:51:35.9162174Z * [new branch] gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 2025-09-07T07:51:35.9165176Z * [new branch] gh/ankitageorge/13/base -> origin/gh/ankitageorge/13/base 2025-09-07T07:51:35.9166960Z * [new branch] gh/ankitageorge/13/head -> origin/gh/ankitageorge/13/head 2025-09-07T07:51:35.9168600Z * [new branch] gh/ankitageorge/13/orig -> origin/gh/ankitageorge/13/orig 2025-09-07T07:51:35.9171084Z * [new branch] gh/ankitageorge/14/base -> origin/gh/ankitageorge/14/base 2025-09-07T07:51:35.9172593Z * [new branch] gh/ankitageorge/14/head -> origin/gh/ankitageorge/14/head 2025-09-07T07:51:35.9174374Z * [new branch] gh/ankitageorge/14/orig -> origin/gh/ankitageorge/14/orig 2025-09-07T07:51:35.9177191Z * [new branch] gh/ankitageorge/15/base -> origin/gh/ankitageorge/15/base 2025-09-07T07:51:35.9178774Z * [new branch] gh/ankitageorge/15/head -> origin/gh/ankitageorge/15/head 2025-09-07T07:51:35.9180396Z * [new branch] gh/ankitageorge/15/orig -> origin/gh/ankitageorge/15/orig 2025-09-07T07:51:35.9182960Z * [new branch] gh/ankitageorge/16/base -> origin/gh/ankitageorge/16/base 2025-09-07T07:51:35.9184594Z * [new branch] gh/ankitageorge/16/head -> origin/gh/ankitageorge/16/head 2025-09-07T07:51:35.9186539Z * [new branch] gh/ankitageorge/16/orig -> origin/gh/ankitageorge/16/orig 2025-09-07T07:51:35.9188882Z * [new branch] gh/ankitageorge/17/base -> origin/gh/ankitageorge/17/base 2025-09-07T07:51:35.9190440Z * [new 
branch] gh/ankitageorge/17/head -> origin/gh/ankitageorge/17/head 2025-09-07T07:51:35.9192142Z * [new branch] gh/ankitageorge/17/orig -> origin/gh/ankitageorge/17/orig 2025-09-07T07:51:35.9194532Z * [new branch] gh/ankitageorge/21/base -> origin/gh/ankitageorge/21/base 2025-09-07T07:51:35.9196444Z * [new branch] gh/ankitageorge/21/head -> origin/gh/ankitageorge/21/head 2025-09-07T07:51:35.9197984Z * [new branch] gh/ankitageorge/21/orig -> origin/gh/ankitageorge/21/orig 2025-09-07T07:51:35.9201158Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-09-07T07:51:35.9202704Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-09-07T07:51:35.9204854Z * [new branch] gh/anshul-si/15/base -> origin/gh/anshul-si/15/base 2025-09-07T07:51:35.9206782Z * [new branch] gh/anshul-si/15/head -> origin/gh/anshul-si/15/head 2025-09-07T07:51:35.9208425Z * [new branch] gh/anshul-si/15/orig -> origin/gh/anshul-si/15/orig 2025-09-07T07:51:35.9210765Z * [new branch] gh/anshul-si/16/base -> origin/gh/anshul-si/16/base 2025-09-07T07:51:35.9212351Z * [new branch] gh/anshul-si/16/head -> origin/gh/anshul-si/16/head 2025-09-07T07:51:35.9214157Z * [new branch] gh/anshul-si/16/orig -> origin/gh/anshul-si/16/orig 2025-09-07T07:51:35.9216623Z * [new branch] gh/anshul-si/17/base -> origin/gh/anshul-si/17/base 2025-09-07T07:51:35.9218344Z * [new branch] gh/anshul-si/17/head -> origin/gh/anshul-si/17/head 2025-09-07T07:51:35.9220250Z * [new branch] gh/anshul-si/17/orig -> origin/gh/anshul-si/17/orig 2025-09-07T07:51:35.9222563Z * [new branch] gh/anshul-si/18/base -> origin/gh/anshul-si/18/base 2025-09-07T07:51:35.9224336Z * [new branch] gh/anshul-si/18/head -> origin/gh/anshul-si/18/head 2025-09-07T07:51:35.9226225Z * [new branch] gh/anshul-si/18/orig -> origin/gh/anshul-si/18/orig 2025-09-07T07:51:35.9228533Z * [new branch] gh/anshul-si/19/base -> origin/gh/anshul-si/19/base 2025-09-07T07:51:35.9230156Z * [new branch] gh/anshul-si/19/head -> 
origin/gh/anshul-si/19/head 2025-09-07T07:51:35.9231761Z * [new branch] gh/anshul-si/19/orig -> origin/gh/anshul-si/19/orig 2025-09-07T07:51:35.9233948Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-09-07T07:51:35.9235734Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-09-07T07:51:35.9238360Z * [new branch] gh/anshul-si/20/base -> origin/gh/anshul-si/20/base 2025-09-07T07:51:35.9240002Z * [new branch] gh/anshul-si/20/head -> origin/gh/anshul-si/20/head 2025-09-07T07:51:35.9241558Z * [new branch] gh/anshul-si/20/orig -> origin/gh/anshul-si/20/orig 2025-09-07T07:51:35.9243797Z * [new branch] gh/anshul-si/21/base -> origin/gh/anshul-si/21/base 2025-09-07T07:51:35.9245575Z * [new branch] gh/anshul-si/21/head -> origin/gh/anshul-si/21/head 2025-09-07T07:51:35.9247279Z * [new branch] gh/anshul-si/21/orig -> origin/gh/anshul-si/21/orig 2025-09-07T07:51:35.9249575Z * [new branch] gh/anshul-si/22/base -> origin/gh/anshul-si/22/base 2025-09-07T07:51:35.9251199Z * [new branch] gh/anshul-si/22/head -> origin/gh/anshul-si/22/head 2025-09-07T07:51:35.9252759Z * [new branch] gh/anshul-si/22/orig -> origin/gh/anshul-si/22/orig 2025-09-07T07:51:35.9254907Z * [new branch] gh/anshul-si/23/base -> origin/gh/anshul-si/23/base 2025-09-07T07:51:35.9256904Z * [new branch] gh/anshul-si/23/head -> origin/gh/anshul-si/23/head 2025-09-07T07:51:35.9258416Z * [new branch] gh/anshul-si/23/orig -> origin/gh/anshul-si/23/orig 2025-09-07T07:51:35.9260751Z * [new branch] gh/anshul-si/24/base -> origin/gh/anshul-si/24/base 2025-09-07T07:51:35.9262648Z * [new branch] gh/anshul-si/24/head -> origin/gh/anshul-si/24/head 2025-09-07T07:51:35.9264122Z * [new branch] gh/anshul-si/24/orig -> origin/gh/anshul-si/24/orig 2025-09-07T07:51:35.9266848Z * [new branch] gh/anshul-si/25/base -> origin/gh/anshul-si/25/base 2025-09-07T07:51:35.9268664Z * [new branch] gh/anshul-si/25/head -> origin/gh/anshul-si/25/head 2025-09-07T07:51:35.9270068Z * [new branch] 
gh/anshul-si/25/orig -> origin/gh/anshul-si/25/orig 2025-09-07T07:51:35.9272287Z * [new branch] gh/anshul-si/26/base -> origin/gh/anshul-si/26/base 2025-09-07T07:51:35.9273876Z * [new branch] gh/anshul-si/26/head -> origin/gh/anshul-si/26/head 2025-09-07T07:51:35.9275663Z * [new branch] gh/anshul-si/26/orig -> origin/gh/anshul-si/26/orig 2025-09-07T07:51:35.9278064Z * [new branch] gh/anshul-si/27/base -> origin/gh/anshul-si/27/base 2025-09-07T07:51:35.9279733Z * [new branch] gh/anshul-si/27/head -> origin/gh/anshul-si/27/head 2025-09-07T07:51:35.9281241Z * [new branch] gh/anshul-si/27/orig -> origin/gh/anshul-si/27/orig 2025-09-07T07:51:35.9283377Z * [new branch] gh/anshul-si/28/base -> origin/gh/anshul-si/28/base 2025-09-07T07:51:35.9285088Z * [new branch] gh/anshul-si/28/head -> origin/gh/anshul-si/28/head 2025-09-07T07:51:35.9286847Z * [new branch] gh/anshul-si/28/orig -> origin/gh/anshul-si/28/orig 2025-09-07T07:51:35.9288895Z * [new branch] gh/anshul-si/29/base -> origin/gh/anshul-si/29/base 2025-09-07T07:51:35.9290830Z * [new branch] gh/anshul-si/29/head -> origin/gh/anshul-si/29/head 2025-09-07T07:51:35.9292304Z * [new branch] gh/anshul-si/29/orig -> origin/gh/anshul-si/29/orig 2025-09-07T07:51:35.9294452Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-09-07T07:51:35.9296356Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-09-07T07:51:35.9298473Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-09-07T07:51:35.9299926Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-09-07T07:51:35.9302258Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-09-07T07:51:35.9303874Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-09-07T07:51:35.9307079Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-09-07T07:51:35.9308747Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-09-07T07:51:35.9311550Z * [new 
branch] gh/bdhirsh/650/base -> origin/gh/bdhirsh/650/base 2025-09-07T07:51:35.9313442Z * [new branch] gh/bdhirsh/650/head -> origin/gh/bdhirsh/650/head 2025-09-07T07:51:35.9315182Z * [new branch] gh/bdhirsh/650/orig -> origin/gh/bdhirsh/650/orig 2025-09-07T07:51:35.9317679Z * [new branch] gh/bdhirsh/663/base -> origin/gh/bdhirsh/663/base 2025-09-07T07:51:35.9319200Z * [new branch] gh/bdhirsh/663/head -> origin/gh/bdhirsh/663/head 2025-09-07T07:51:35.9320780Z * [new branch] gh/bdhirsh/663/orig -> origin/gh/bdhirsh/663/orig 2025-09-07T07:51:35.9323211Z * [new branch] gh/bdhirsh/665/base -> origin/gh/bdhirsh/665/base 2025-09-07T07:51:35.9324836Z * [new branch] gh/bdhirsh/665/head -> origin/gh/bdhirsh/665/head 2025-09-07T07:51:35.9326955Z * [new branch] gh/bdhirsh/665/orig -> origin/gh/bdhirsh/665/orig 2025-09-07T07:51:35.9329436Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-09-07T07:51:35.9331086Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-09-07T07:51:35.9332790Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-09-07T07:51:35.9335646Z * [new branch] gh/bdhirsh/667/base -> origin/gh/bdhirsh/667/base 2025-09-07T07:51:35.9337551Z * [new branch] gh/bdhirsh/667/head -> origin/gh/bdhirsh/667/head 2025-09-07T07:51:35.9339008Z * [new branch] gh/bdhirsh/667/orig -> origin/gh/bdhirsh/667/orig 2025-09-07T07:51:35.9341197Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-09-07T07:51:35.9343009Z * [new branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-09-07T07:51:35.9344532Z * [new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-09-07T07:51:35.9347219Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-09-07T07:51:35.9348766Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-09-07T07:51:35.9350336Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-09-07T07:51:35.9352850Z * [new branch] 
gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-09-07T07:51:35.9354588Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-09-07T07:51:35.9356501Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-09-07T07:51:35.9359299Z * [new branch] gh/benjaminglass1/100/base -> origin/gh/benjaminglass1/100/base 2025-09-07T07:51:35.9360910Z * [new branch] gh/benjaminglass1/100/head -> origin/gh/benjaminglass1/100/head 2025-09-07T07:51:35.9362570Z * [new branch] gh/benjaminglass1/100/orig -> origin/gh/benjaminglass1/100/orig 2025-09-07T07:51:35.9364894Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-09-07T07:51:35.9366782Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-09-07T07:51:35.9368491Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-09-07T07:51:35.9370676Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-09-07T07:51:35.9372343Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-09-07T07:51:35.9373928Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-09-07T07:51:35.9376408Z * [new branch] gh/benjaminglass1/103/base -> origin/gh/benjaminglass1/103/base 2025-09-07T07:51:35.9378030Z * [new branch] gh/benjaminglass1/103/head -> origin/gh/benjaminglass1/103/head 2025-09-07T07:51:35.9379568Z * [new branch] gh/benjaminglass1/103/orig -> origin/gh/benjaminglass1/103/orig 2025-09-07T07:51:35.9382012Z * [new branch] gh/benjaminglass1/104/base -> origin/gh/benjaminglass1/104/base 2025-09-07T07:51:35.9383663Z * [new branch] gh/benjaminglass1/104/head -> origin/gh/benjaminglass1/104/head 2025-09-07T07:51:35.9385299Z * [new branch] gh/benjaminglass1/104/orig -> origin/gh/benjaminglass1/104/orig 2025-09-07T07:51:35.9387664Z * [new branch] gh/benjaminglass1/105/base -> origin/gh/benjaminglass1/105/base 2025-09-07T07:51:35.9389289Z * 
[new branch] gh/benjaminglass1/105/head -> origin/gh/benjaminglass1/105/head 2025-09-07T07:51:35.9390868Z * [new branch] gh/benjaminglass1/105/orig -> origin/gh/benjaminglass1/105/orig 2025-09-07T07:51:35.9393066Z * [new branch] gh/benjaminglass1/106/base -> origin/gh/benjaminglass1/106/base 2025-09-07T07:51:35.9394674Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 2025-09-07T07:51:35.9396615Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-09-07T07:51:35.9398876Z * [new branch] gh/benjaminglass1/79/base -> origin/gh/benjaminglass1/79/base 2025-09-07T07:51:35.9400492Z * [new branch] gh/benjaminglass1/79/head -> origin/gh/benjaminglass1/79/head 2025-09-07T07:51:35.9402196Z * [new branch] gh/benjaminglass1/79/orig -> origin/gh/benjaminglass1/79/orig 2025-09-07T07:51:35.9404393Z * [new branch] gh/benjaminglass1/86/base -> origin/gh/benjaminglass1/86/base 2025-09-07T07:51:35.9406263Z * [new branch] gh/benjaminglass1/86/head -> origin/gh/benjaminglass1/86/head 2025-09-07T07:51:35.9407823Z * [new branch] gh/benjaminglass1/86/orig -> origin/gh/benjaminglass1/86/orig 2025-09-07T07:51:35.9410212Z * [new branch] gh/benjaminglass1/89/base -> origin/gh/benjaminglass1/89/base 2025-09-07T07:51:35.9411793Z * [new branch] gh/benjaminglass1/89/head -> origin/gh/benjaminglass1/89/head 2025-09-07T07:51:35.9413376Z * [new branch] gh/benjaminglass1/89/orig -> origin/gh/benjaminglass1/89/orig 2025-09-07T07:51:35.9415926Z * [new branch] gh/benjaminglass1/91/base -> origin/gh/benjaminglass1/91/base 2025-09-07T07:51:35.9417563Z * [new branch] gh/benjaminglass1/91/head -> origin/gh/benjaminglass1/91/head 2025-09-07T07:51:35.9419129Z * [new branch] gh/benjaminglass1/91/orig -> origin/gh/benjaminglass1/91/orig 2025-09-07T07:51:35.9421520Z * [new branch] gh/benjaminglass1/93/base -> origin/gh/benjaminglass1/93/base 2025-09-07T07:51:35.9423156Z * [new branch] gh/benjaminglass1/93/head -> origin/gh/benjaminglass1/93/head 
2025-09-07T07:51:35.9424838Z * [new branch] gh/benjaminglass1/93/orig -> origin/gh/benjaminglass1/93/orig 2025-09-07T07:51:35.9427325Z * [new branch] gh/benjaminglass1/95/base -> origin/gh/benjaminglass1/95/base 2025-09-07T07:51:35.9428971Z * [new branch] gh/benjaminglass1/95/head -> origin/gh/benjaminglass1/95/head 2025-09-07T07:51:35.9430724Z * [new branch] gh/benjaminglass1/95/orig -> origin/gh/benjaminglass1/95/orig 2025-09-07T07:51:35.9433062Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-09-07T07:51:35.9434697Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-09-07T07:51:35.9437618Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-09-07T07:51:35.9438915Z * [new branch] gh/benjaminglass1/99/base -> origin/gh/benjaminglass1/99/base 2025-09-07T07:51:35.9440347Z * [new branch] gh/benjaminglass1/99/head -> origin/gh/benjaminglass1/99/head 2025-09-07T07:51:35.9441975Z * [new branch] gh/benjaminglass1/99/orig -> origin/gh/benjaminglass1/99/orig 2025-09-07T07:51:35.9444724Z * [new branch] gh/bobrenjc93/514/base -> origin/gh/bobrenjc93/514/base 2025-09-07T07:51:35.9446769Z * [new branch] gh/bobrenjc93/514/head -> origin/gh/bobrenjc93/514/head 2025-09-07T07:51:35.9448370Z * [new branch] gh/bobrenjc93/514/orig -> origin/gh/bobrenjc93/514/orig 2025-09-07T07:51:35.9450637Z * [new branch] gh/bobrenjc93/521/base -> origin/gh/bobrenjc93/521/base 2025-09-07T07:51:35.9452214Z * [new branch] gh/bobrenjc93/521/head -> origin/gh/bobrenjc93/521/head 2025-09-07T07:51:35.9453704Z * [new branch] gh/bobrenjc93/521/orig -> origin/gh/bobrenjc93/521/orig 2025-09-07T07:51:35.9456320Z * [new branch] gh/bobrenjc93/522/base -> origin/gh/bobrenjc93/522/base 2025-09-07T07:51:35.9457834Z * [new branch] gh/bobrenjc93/522/head -> origin/gh/bobrenjc93/522/head 2025-09-07T07:51:35.9459374Z * [new branch] gh/bobrenjc93/522/orig -> origin/gh/bobrenjc93/522/orig 2025-09-07T07:51:35.9461672Z * [new 
branch] gh/bobrenjc93/525/base -> origin/gh/bobrenjc93/525/base 2025-09-07T07:51:35.9463318Z * [new branch] gh/bobrenjc93/525/head -> origin/gh/bobrenjc93/525/head 2025-09-07T07:51:35.9464818Z * [new branch] gh/bobrenjc93/525/orig -> origin/gh/bobrenjc93/525/orig 2025-09-07T07:51:35.9467628Z * [new branch] gh/bobrenjc93/526/base -> origin/gh/bobrenjc93/526/base 2025-09-07T07:51:35.9469044Z * [new branch] gh/bobrenjc93/526/head -> origin/gh/bobrenjc93/526/head 2025-09-07T07:51:35.9470542Z * [new branch] gh/bobrenjc93/526/orig -> origin/gh/bobrenjc93/526/orig 2025-09-07T07:51:35.9472797Z * [new branch] gh/bobrenjc93/527/base -> origin/gh/bobrenjc93/527/base 2025-09-07T07:51:35.9474396Z * [new branch] gh/bobrenjc93/527/head -> origin/gh/bobrenjc93/527/head 2025-09-07T07:51:35.9476324Z * [new branch] gh/bobrenjc93/527/orig -> origin/gh/bobrenjc93/527/orig 2025-09-07T07:51:35.9478532Z * [new branch] gh/bobrenjc93/528/base -> origin/gh/bobrenjc93/528/base 2025-09-07T07:51:35.9480661Z * [new branch] gh/bobrenjc93/528/head -> origin/gh/bobrenjc93/528/head 2025-09-07T07:51:35.9482238Z * [new branch] gh/bobrenjc93/528/orig -> origin/gh/bobrenjc93/528/orig 2025-09-07T07:51:35.9484436Z * [new branch] gh/bobrenjc93/529/base -> origin/gh/bobrenjc93/529/base 2025-09-07T07:51:35.9486383Z * [new branch] gh/bobrenjc93/529/head -> origin/gh/bobrenjc93/529/head 2025-09-07T07:51:35.9487950Z * [new branch] gh/bobrenjc93/529/orig -> origin/gh/bobrenjc93/529/orig 2025-09-07T07:51:35.9490204Z * [new branch] gh/bobrenjc93/535/base -> origin/gh/bobrenjc93/535/base 2025-09-07T07:51:35.9491738Z * [new branch] gh/bobrenjc93/535/head -> origin/gh/bobrenjc93/535/head 2025-09-07T07:51:35.9493292Z * [new branch] gh/bobrenjc93/535/orig -> origin/gh/bobrenjc93/535/orig 2025-09-07T07:51:35.9495863Z * [new branch] gh/bobrenjc93/537/base -> origin/gh/bobrenjc93/537/base 2025-09-07T07:51:35.9497576Z * [new branch] gh/bobrenjc93/537/head -> origin/gh/bobrenjc93/537/head 2025-09-07T07:51:35.9499122Z * [new 
branch] gh/bobrenjc93/537/orig -> origin/gh/bobrenjc93/537/orig 2025-09-07T07:51:35.9501680Z * [new branch] gh/bobrenjc93/539/base -> origin/gh/bobrenjc93/539/base 2025-09-07T07:51:35.9503392Z * [new branch] gh/bobrenjc93/539/head -> origin/gh/bobrenjc93/539/head 2025-09-07T07:51:35.9505155Z * [new branch] gh/bobrenjc93/539/orig -> origin/gh/bobrenjc93/539/orig 2025-09-07T07:51:35.9507623Z * [new branch] gh/bobrenjc93/540/base -> origin/gh/bobrenjc93/540/base 2025-09-07T07:51:35.9509235Z * [new branch] gh/bobrenjc93/540/head -> origin/gh/bobrenjc93/540/head 2025-09-07T07:51:35.9510801Z * [new branch] gh/bobrenjc93/540/orig -> origin/gh/bobrenjc93/540/orig 2025-09-07T07:51:35.9513063Z * [new branch] gh/bobrenjc93/541/base -> origin/gh/bobrenjc93/541/base 2025-09-07T07:51:35.9514704Z * [new branch] gh/bobrenjc93/541/head -> origin/gh/bobrenjc93/541/head 2025-09-07T07:51:35.9516527Z * [new branch] gh/bobrenjc93/541/orig -> origin/gh/bobrenjc93/541/orig 2025-09-07T07:51:35.9518777Z * [new branch] gh/bobrenjc93/542/base -> origin/gh/bobrenjc93/542/base 2025-09-07T07:51:35.9520358Z * [new branch] gh/bobrenjc93/542/head -> origin/gh/bobrenjc93/542/head 2025-09-07T07:51:35.9521935Z * [new branch] gh/bobrenjc93/542/orig -> origin/gh/bobrenjc93/542/orig 2025-09-07T07:51:35.9524191Z * [new branch] gh/bobrenjc93/543/base -> origin/gh/bobrenjc93/543/base 2025-09-07T07:51:35.9526266Z * [new branch] gh/bobrenjc93/543/head -> origin/gh/bobrenjc93/543/head 2025-09-07T07:51:35.9527835Z * [new branch] gh/bobrenjc93/543/orig -> origin/gh/bobrenjc93/543/orig 2025-09-07T07:51:35.9529938Z * [new branch] gh/bobrenjc93/544/base -> origin/gh/bobrenjc93/544/base 2025-09-07T07:51:35.9531881Z * [new branch] gh/bobrenjc93/544/head -> origin/gh/bobrenjc93/544/head 2025-09-07T07:51:35.9533280Z * [new branch] gh/bobrenjc93/544/orig -> origin/gh/bobrenjc93/544/orig 2025-09-07T07:51:35.9535575Z * [new branch] gh/bobrenjc93/545/base -> origin/gh/bobrenjc93/545/base 2025-09-07T07:51:35.9537291Z * [new 
branch] gh/bobrenjc93/545/head -> origin/gh/bobrenjc93/545/head 2025-09-07T07:51:35.9538896Z * [new branch] gh/bobrenjc93/545/orig -> origin/gh/bobrenjc93/545/orig 2025-09-07T07:51:35.9541246Z * [new branch] gh/bobrenjc93/546/base -> origin/gh/bobrenjc93/546/base 2025-09-07T07:51:35.9543006Z * [new branch] gh/bobrenjc93/546/head -> origin/gh/bobrenjc93/546/head 2025-09-07T07:51:35.9544548Z * [new branch] gh/bobrenjc93/546/orig -> origin/gh/bobrenjc93/546/orig 2025-09-07T07:51:35.9548032Z * [new branch] gh/bobrenjc93/547/base -> origin/gh/bobrenjc93/547/base 2025-09-07T07:51:35.9549607Z * [new branch] gh/bobrenjc93/547/head -> origin/gh/bobrenjc93/547/head 2025-09-07T07:51:35.9551217Z * [new branch] gh/bobrenjc93/547/orig -> origin/gh/bobrenjc93/547/orig 2025-09-07T07:51:35.9553457Z * [new branch] gh/bobrenjc93/548/base -> origin/gh/bobrenjc93/548/base 2025-09-07T07:51:35.9555055Z * [new branch] gh/bobrenjc93/548/head -> origin/gh/bobrenjc93/548/head 2025-09-07T07:51:35.9556856Z * [new branch] gh/bobrenjc93/548/orig -> origin/gh/bobrenjc93/548/orig 2025-09-07T07:51:35.9558913Z * [new branch] gh/bobrenjc93/549/base -> origin/gh/bobrenjc93/549/base 2025-09-07T07:51:35.9560544Z * [new branch] gh/bobrenjc93/549/head -> origin/gh/bobrenjc93/549/head 2025-09-07T07:51:35.9562105Z * [new branch] gh/bobrenjc93/549/orig -> origin/gh/bobrenjc93/549/orig 2025-09-07T07:51:35.9564568Z * [new branch] gh/bobrenjc93/550/base -> origin/gh/bobrenjc93/550/base 2025-09-07T07:51:35.9566444Z * [new branch] gh/bobrenjc93/550/head -> origin/gh/bobrenjc93/550/head 2025-09-07T07:51:35.9568040Z * [new branch] gh/bobrenjc93/550/orig -> origin/gh/bobrenjc93/550/orig 2025-09-07T07:51:35.9570508Z * [new branch] gh/bobrenjc93/551/base -> origin/gh/bobrenjc93/551/base 2025-09-07T07:51:35.9572164Z * [new branch] gh/bobrenjc93/551/head -> origin/gh/bobrenjc93/551/head 2025-09-07T07:51:35.9573935Z * [new branch] gh/bobrenjc93/551/orig -> origin/gh/bobrenjc93/551/orig 2025-09-07T07:51:35.9576369Z * [new 
branch] gh/bobrenjc93/552/base -> origin/gh/bobrenjc93/552/base 2025-09-07T07:51:35.9578026Z * [new branch] gh/bobrenjc93/552/head -> origin/gh/bobrenjc93/552/head 2025-09-07T07:51:35.9579601Z * [new branch] gh/bobrenjc93/552/orig -> origin/gh/bobrenjc93/552/orig 2025-09-07T07:51:35.9581889Z * [new branch] gh/bobrenjc93/553/base -> origin/gh/bobrenjc93/553/base 2025-09-07T07:51:35.9583570Z * [new branch] gh/bobrenjc93/553/head -> origin/gh/bobrenjc93/553/head 2025-09-07T07:51:35.9585243Z * [new branch] gh/bobrenjc93/553/orig -> origin/gh/bobrenjc93/553/orig 2025-09-07T07:51:35.9587583Z * [new branch] gh/bobrenjc93/554/base -> origin/gh/bobrenjc93/554/base 2025-09-07T07:51:35.9589182Z * [new branch] gh/bobrenjc93/554/head -> origin/gh/bobrenjc93/554/head 2025-09-07T07:51:35.9590741Z * [new branch] gh/bobrenjc93/554/orig -> origin/gh/bobrenjc93/554/orig 2025-09-07T07:51:35.9593321Z * [new branch] gh/bobrenjc93/555/base -> origin/gh/bobrenjc93/555/base 2025-09-07T07:51:35.9594621Z * [new branch] gh/bobrenjc93/555/head -> origin/gh/bobrenjc93/555/head 2025-09-07T07:51:35.9596421Z * [new branch] gh/bobrenjc93/555/orig -> origin/gh/bobrenjc93/555/orig 2025-09-07T07:51:35.9598818Z * [new branch] gh/bobrenjc93/556/base -> origin/gh/bobrenjc93/556/base 2025-09-07T07:51:35.9600310Z * [new branch] gh/bobrenjc93/556/head -> origin/gh/bobrenjc93/556/head 2025-09-07T07:51:35.9601955Z * [new branch] gh/bobrenjc93/556/orig -> origin/gh/bobrenjc93/556/orig 2025-09-07T07:51:35.9604749Z * [new branch] gh/briancoutinho/2/base -> origin/gh/briancoutinho/2/base 2025-09-07T07:51:35.9606693Z * [new branch] gh/briancoutinho/2/head -> origin/gh/briancoutinho/2/head 2025-09-07T07:51:35.9609527Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-09-07T07:51:35.9611131Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-09-07T07:51:35.9613382Z * [new branch] gh/c00w/48/base -> origin/gh/c00w/48/base 2025-09-07T07:51:35.9615056Z * [new branch] gh/c00w/48/head -> 
origin/gh/c00w/48/head 2025-09-07T07:51:35.9616889Z * [new branch] gh/c00w/48/orig -> origin/gh/c00w/48/orig 2025-09-07T07:51:35.9619242Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-09-07T07:51:35.9620745Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-09-07T07:51:35.9622655Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-09-07T07:51:35.9624707Z * [new branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-09-07T07:51:35.9626816Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-09-07T07:51:35.9628530Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-09-07T07:51:35.9630650Z * [new branch] gh/c00w/55/base -> origin/gh/c00w/55/base 2025-09-07T07:51:35.9632419Z * [new branch] gh/c00w/55/head -> origin/gh/c00w/55/head 2025-09-07T07:51:35.9633995Z * [new branch] gh/c00w/55/orig -> origin/gh/c00w/55/orig 2025-09-07T07:51:35.9636437Z * [new branch] gh/c00w/56/base -> origin/gh/c00w/56/base 2025-09-07T07:51:35.9638125Z * [new branch] gh/c00w/56/head -> origin/gh/c00w/56/head 2025-09-07T07:51:35.9639752Z * [new branch] gh/c00w/56/orig -> origin/gh/c00w/56/orig 2025-09-07T07:51:35.9642468Z * [new branch] gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-09-07T07:51:35.9644212Z * [new branch] gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-09-07T07:51:35.9646176Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-09-07T07:51:35.9649025Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-09-07T07:51:35.9650742Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-09-07T07:51:35.9653128Z * [new branch] gh/coconutruben/11/base -> origin/gh/coconutruben/11/base 2025-09-07T07:51:35.9654849Z * [new branch] gh/coconutruben/11/head -> origin/gh/coconutruben/11/head 2025-09-07T07:51:35.9656774Z * [new branch] gh/coconutruben/11/orig -> origin/gh/coconutruben/11/orig 2025-09-07T07:51:35.9659532Z * [new branch] gh/coconutruben/12/base -> 
origin/gh/coconutruben/12/base 2025-09-07T07:51:35.9661392Z * [new branch] gh/coconutruben/12/head -> origin/gh/coconutruben/12/head 2025-09-07T07:51:35.9663428Z * [new branch] gh/coconutruben/12/orig -> origin/gh/coconutruben/12/orig 2025-09-07T07:51:35.9666115Z * [new branch] gh/coconutruben/13/base -> origin/gh/coconutruben/13/base 2025-09-07T07:51:35.9667792Z * [new branch] gh/coconutruben/13/head -> origin/gh/coconutruben/13/head 2025-09-07T07:51:35.9669603Z * [new branch] gh/coconutruben/13/orig -> origin/gh/coconutruben/13/orig 2025-09-07T07:51:35.9671832Z * [new branch] gh/coconutruben/14/base -> origin/gh/coconutruben/14/base 2025-09-07T07:51:35.9673440Z * [new branch] gh/coconutruben/14/head -> origin/gh/coconutruben/14/head 2025-09-07T07:51:35.9675153Z * [new branch] gh/coconutruben/14/orig -> origin/gh/coconutruben/14/orig 2025-09-07T07:51:35.9677901Z * [new branch] gh/coconutruben/15/base -> origin/gh/coconutruben/15/base 2025-09-07T07:51:35.9679602Z * [new branch] gh/coconutruben/15/head -> origin/gh/coconutruben/15/head 2025-09-07T07:51:35.9681308Z * [new branch] gh/coconutruben/15/orig -> origin/gh/coconutruben/15/orig 2025-09-07T07:51:35.9683602Z * [new branch] gh/coconutruben/16/base -> origin/gh/coconutruben/16/base 2025-09-07T07:51:35.9685306Z * [new branch] gh/coconutruben/16/head -> origin/gh/coconutruben/16/head 2025-09-07T07:51:35.9687064Z * [new branch] gh/coconutruben/16/orig -> origin/gh/coconutruben/16/orig 2025-09-07T07:51:35.9689566Z * [new branch] gh/coconutruben/17/base -> origin/gh/coconutruben/17/base 2025-09-07T07:51:35.9691299Z * [new branch] gh/coconutruben/17/head -> origin/gh/coconutruben/17/head 2025-09-07T07:51:35.9692960Z * [new branch] gh/coconutruben/17/orig -> origin/gh/coconutruben/17/orig 2025-09-07T07:51:35.9695710Z * [new branch] gh/coconutruben/18/base -> origin/gh/coconutruben/18/base 2025-09-07T07:51:35.9697364Z * [new branch] gh/coconutruben/18/head -> origin/gh/coconutruben/18/head 2025-09-07T07:51:35.9699009Z * 
[new branch] gh/coconutruben/18/orig -> origin/gh/coconutruben/18/orig 2025-09-07T07:51:35.9701338Z * [new branch] gh/coconutruben/19/base -> origin/gh/coconutruben/19/base 2025-09-07T07:51:35.9703355Z * [new branch] gh/coconutruben/19/head -> origin/gh/coconutruben/19/head 2025-09-07T07:51:35.9704891Z * [new branch] gh/coconutruben/19/orig -> origin/gh/coconutruben/19/orig 2025-09-07T07:51:35.9707587Z * [new branch] gh/coconutruben/20/base -> origin/gh/coconutruben/20/base 2025-09-07T07:51:35.9709245Z * [new branch] gh/coconutruben/20/head -> origin/gh/coconutruben/20/head 2025-09-07T07:51:35.9710930Z * [new branch] gh/coconutruben/20/orig -> origin/gh/coconutruben/20/orig 2025-09-07T07:51:35.9713243Z * [new branch] gh/coconutruben/21/base -> origin/gh/coconutruben/21/base 2025-09-07T07:51:35.9714805Z * [new branch] gh/coconutruben/21/head -> origin/gh/coconutruben/21/head 2025-09-07T07:51:35.9716703Z * [new branch] gh/coconutruben/21/orig -> origin/gh/coconutruben/21/orig 2025-09-07T07:51:35.9719027Z * [new branch] gh/coconutruben/22/base -> origin/gh/coconutruben/22/base 2025-09-07T07:51:35.9720747Z * [new branch] gh/coconutruben/22/head -> origin/gh/coconutruben/22/head 2025-09-07T07:51:35.9722682Z * [new branch] gh/coconutruben/22/orig -> origin/gh/coconutruben/22/orig 2025-09-07T07:51:35.9725089Z * [new branch] gh/coconutruben/24/base -> origin/gh/coconutruben/24/base 2025-09-07T07:51:35.9726989Z * [new branch] gh/coconutruben/24/head -> origin/gh/coconutruben/24/head 2025-09-07T07:51:35.9728614Z * [new branch] gh/coconutruben/24/orig -> origin/gh/coconutruben/24/orig 2025-09-07T07:51:35.9731236Z * [new branch] gh/coconutruben/25/base -> origin/gh/coconutruben/25/base 2025-09-07T07:51:35.9733151Z * [new branch] gh/coconutruben/25/head -> origin/gh/coconutruben/25/head 2025-09-07T07:51:35.9735130Z * [new branch] gh/coconutruben/25/orig -> origin/gh/coconutruben/25/orig 2025-09-07T07:51:35.9737978Z * [new branch] gh/coconutruben/28/base -> 
origin/gh/coconutruben/28/base 2025-09-07T07:51:35.9739404Z * [new branch] gh/coconutruben/28/head -> origin/gh/coconutruben/28/head 2025-09-07T07:51:35.9740983Z * [new branch] gh/coconutruben/28/orig -> origin/gh/coconutruben/28/orig 2025-09-07T07:51:35.9743522Z * [new branch] gh/coconutruben/29/base -> origin/gh/coconutruben/29/base 2025-09-07T07:51:35.9745326Z * [new branch] gh/coconutruben/29/head -> origin/gh/coconutruben/29/head 2025-09-07T07:51:35.9747135Z * [new branch] gh/coconutruben/29/orig -> origin/gh/coconutruben/29/orig 2025-09-07T07:51:35.9749620Z * [new branch] gh/coconutruben/30/base -> origin/gh/coconutruben/30/base 2025-09-07T07:51:35.9751163Z * [new branch] gh/coconutruben/30/head -> origin/gh/coconutruben/30/head 2025-09-07T07:51:35.9752815Z * [new branch] gh/coconutruben/30/orig -> origin/gh/coconutruben/30/orig 2025-09-07T07:51:35.9755335Z * [new branch] gh/coconutruben/31/base -> origin/gh/coconutruben/31/base 2025-09-07T07:51:35.9757186Z * [new branch] gh/coconutruben/31/head -> origin/gh/coconutruben/31/head 2025-09-07T07:51:35.9758748Z * [new branch] gh/coconutruben/31/orig -> origin/gh/coconutruben/31/orig 2025-09-07T07:51:35.9761335Z * [new branch] gh/coconutruben/32/base -> origin/gh/coconutruben/32/base 2025-09-07T07:51:35.9763012Z * [new branch] gh/coconutruben/32/head -> origin/gh/coconutruben/32/head 2025-09-07T07:51:35.9764629Z * [new branch] gh/coconutruben/32/orig -> origin/gh/coconutruben/32/orig 2025-09-07T07:51:35.9767494Z * [new branch] gh/coconutruben/33/base -> origin/gh/coconutruben/33/base 2025-09-07T07:51:35.9769144Z * [new branch] gh/coconutruben/33/head -> origin/gh/coconutruben/33/head 2025-09-07T07:51:35.9770758Z * [new branch] gh/coconutruben/33/orig -> origin/gh/coconutruben/33/orig 2025-09-07T07:51:35.9775900Z * [new branch] gh/coconutruben/34/base -> origin/gh/coconutruben/34/base 2025-09-07T07:51:35.9777452Z * [new branch] gh/coconutruben/34/head -> origin/gh/coconutruben/34/head 2025-09-07T07:51:35.9779068Z * 
[new branch] gh/coconutruben/34/orig -> origin/gh/coconutruben/34/orig 2025-09-07T07:51:35.9781378Z * [new branch] gh/coconutruben/35/base -> origin/gh/coconutruben/35/base 2025-09-07T07:51:35.9783216Z * [new branch] gh/coconutruben/35/head -> origin/gh/coconutruben/35/head 2025-09-07T07:51:35.9784832Z * [new branch] gh/coconutruben/35/orig -> origin/gh/coconutruben/35/orig 2025-09-07T07:51:35.9788797Z * [new branch] gh/coconutruben/36/base -> origin/gh/coconutruben/36/base 2025-09-07T07:51:35.9790811Z * [new branch] gh/coconutruben/36/head -> origin/gh/coconutruben/36/head 2025-09-07T07:51:35.9793098Z * [new branch] gh/coconutruben/36/orig -> origin/gh/coconutruben/36/orig 2025-09-07T07:51:35.9796089Z * [new branch] gh/coconutruben/37/base -> origin/gh/coconutruben/37/base 2025-09-07T07:51:35.9797614Z * [new branch] gh/coconutruben/37/head -> origin/gh/coconutruben/37/head 2025-09-07T07:51:35.9799250Z * [new branch] gh/coconutruben/37/orig -> origin/gh/coconutruben/37/orig 2025-09-07T07:51:35.9801760Z * [new branch] gh/coconutruben/38/base -> origin/gh/coconutruben/38/base 2025-09-07T07:51:35.9803619Z * [new branch] gh/coconutruben/38/head -> origin/gh/coconutruben/38/head 2025-09-07T07:51:35.9805277Z * [new branch] gh/coconutruben/38/orig -> origin/gh/coconutruben/38/orig 2025-09-07T07:51:35.9807939Z * [new branch] gh/coconutruben/39/base -> origin/gh/coconutruben/39/base 2025-09-07T07:51:35.9809629Z * [new branch] gh/coconutruben/39/head -> origin/gh/coconutruben/39/head 2025-09-07T07:51:35.9811071Z * [new branch] gh/coconutruben/39/orig -> origin/gh/coconutruben/39/orig 2025-09-07T07:51:35.9813532Z * [new branch] gh/coconutruben/40/base -> origin/gh/coconutruben/40/base 2025-09-07T07:51:35.9815259Z * [new branch] gh/coconutruben/40/head -> origin/gh/coconutruben/40/head 2025-09-07T07:51:35.9817006Z * [new branch] gh/coconutruben/40/orig -> origin/gh/coconutruben/40/orig 2025-09-07T07:51:35.9819447Z * [new branch] gh/coconutruben/41/base -> 
origin/gh/coconutruben/41/base 2025-09-07T07:51:35.9821185Z * [new branch] gh/coconutruben/41/head -> origin/gh/coconutruben/41/head 2025-09-07T07:51:35.9822908Z * [new branch] gh/coconutruben/41/orig -> origin/gh/coconutruben/41/orig 2025-09-07T07:51:35.9825568Z * [new branch] gh/coconutruben/42/base -> origin/gh/coconutruben/42/base 2025-09-07T07:51:35.9827582Z * [new branch] gh/coconutruben/42/head -> origin/gh/coconutruben/42/head 2025-09-07T07:51:35.9829201Z * [new branch] gh/coconutruben/42/orig -> origin/gh/coconutruben/42/orig 2025-09-07T07:51:35.9831841Z * [new branch] gh/coconutruben/43/base -> origin/gh/coconutruben/43/base 2025-09-07T07:51:35.9833370Z * [new branch] gh/coconutruben/43/head -> origin/gh/coconutruben/43/head 2025-09-07T07:51:35.9835182Z * [new branch] gh/coconutruben/43/orig -> origin/gh/coconutruben/43/orig 2025-09-07T07:51:35.9837846Z * [new branch] gh/coconutruben/44/base -> origin/gh/coconutruben/44/base 2025-09-07T07:51:35.9839431Z * [new branch] gh/coconutruben/44/head -> origin/gh/coconutruben/44/head 2025-09-07T07:51:35.9841146Z * [new branch] gh/coconutruben/44/orig -> origin/gh/coconutruben/44/orig 2025-09-07T07:51:35.9843725Z * [new branch] gh/coconutruben/45/base -> origin/gh/coconutruben/45/base 2025-09-07T07:51:35.9845709Z * [new branch] gh/coconutruben/45/head -> origin/gh/coconutruben/45/head 2025-09-07T07:51:35.9847416Z * [new branch] gh/coconutruben/45/orig -> origin/gh/coconutruben/45/orig 2025-09-07T07:51:35.9849716Z * [new branch] gh/coconutruben/46/base -> origin/gh/coconutruben/46/base 2025-09-07T07:51:35.9851369Z * [new branch] gh/coconutruben/46/head -> origin/gh/coconutruben/46/head 2025-09-07T07:51:35.9853011Z * [new branch] gh/coconutruben/46/orig -> origin/gh/coconutruben/46/orig 2025-09-07T07:51:35.9855685Z * [new branch] gh/coconutruben/47/base -> origin/gh/coconutruben/47/base 2025-09-07T07:51:35.9857371Z * [new branch] gh/coconutruben/47/head -> origin/gh/coconutruben/47/head 2025-09-07T07:51:35.9859109Z * 
[new branch] gh/coconutruben/47/orig -> origin/gh/coconutruben/47/orig 2025-09-07T07:51:35.9861714Z * [new branch] gh/coconutruben/48/base -> origin/gh/coconutruben/48/base 2025-09-07T07:51:35.9863481Z * [new branch] gh/coconutruben/48/head -> origin/gh/coconutruben/48/head 2025-09-07T07:51:35.9865178Z * [new branch] gh/coconutruben/48/orig -> origin/gh/coconutruben/48/orig 2025-09-07T07:51:35.9868080Z * [new branch] gh/coconutruben/49/base -> origin/gh/coconutruben/49/base 2025-09-07T07:51:35.9869684Z * [new branch] gh/coconutruben/49/head -> origin/gh/coconutruben/49/head 2025-09-07T07:51:35.9871305Z * [new branch] gh/coconutruben/49/orig -> origin/gh/coconutruben/49/orig 2025-09-07T07:51:35.9873745Z * [new branch] gh/coconutruben/50/base -> origin/gh/coconutruben/50/base 2025-09-07T07:51:35.9875613Z * [new branch] gh/coconutruben/50/head -> origin/gh/coconutruben/50/head 2025-09-07T07:51:35.9877523Z * [new branch] gh/coconutruben/50/orig -> origin/gh/coconutruben/50/orig 2025-09-07T07:51:35.9880023Z * [new branch] gh/coconutruben/51/base -> origin/gh/coconutruben/51/base 2025-09-07T07:51:35.9881555Z * [new branch] gh/coconutruben/51/head -> origin/gh/coconutruben/51/head 2025-09-07T07:51:35.9883232Z * [new branch] gh/coconutruben/51/orig -> origin/gh/coconutruben/51/orig 2025-09-07T07:51:35.9886022Z * [new branch] gh/coconutruben/52/base -> origin/gh/coconutruben/52/base 2025-09-07T07:51:35.9887760Z * [new branch] gh/coconutruben/52/head -> origin/gh/coconutruben/52/head 2025-09-07T07:51:35.9889513Z * [new branch] gh/coconutruben/52/orig -> origin/gh/coconutruben/52/orig 2025-09-07T07:51:35.9892017Z * [new branch] gh/coconutruben/53/base -> origin/gh/coconutruben/53/base 2025-09-07T07:51:35.9893590Z * [new branch] gh/coconutruben/53/head -> origin/gh/coconutruben/53/head 2025-09-07T07:51:35.9895256Z * [new branch] gh/coconutruben/53/orig -> origin/gh/coconutruben/53/orig 2025-09-07T07:51:35.9897945Z * [new branch] gh/coconutruben/54/base -> 
origin/gh/coconutruben/54/base 2025-09-07T07:51:35.9899514Z * [new branch] gh/coconutruben/54/head -> origin/gh/coconutruben/54/head 2025-09-07T07:51:35.9901124Z * [new branch] gh/coconutruben/54/orig -> origin/gh/coconutruben/54/orig 2025-09-07T07:51:35.9903758Z * [new branch] gh/coconutruben/55/base -> origin/gh/coconutruben/55/base 2025-09-07T07:51:35.9905613Z * [new branch] gh/coconutruben/55/head -> origin/gh/coconutruben/55/head 2025-09-07T07:51:35.9907383Z * [new branch] gh/coconutruben/55/orig -> origin/gh/coconutruben/55/orig 2025-09-07T07:51:35.9909958Z * [new branch] gh/coconutruben/56/base -> origin/gh/coconutruben/56/base 2025-09-07T07:51:35.9911480Z * [new branch] gh/coconutruben/56/head -> origin/gh/coconutruben/56/head 2025-09-07T07:51:35.9913117Z * [new branch] gh/coconutruben/56/orig -> origin/gh/coconutruben/56/orig 2025-09-07T07:51:35.9915809Z * [new branch] gh/coconutruben/57/base -> origin/gh/coconutruben/57/base 2025-09-07T07:51:35.9917501Z * [new branch] gh/coconutruben/57/head -> origin/gh/coconutruben/57/head 2025-09-07T07:51:35.9919205Z * [new branch] gh/coconutruben/57/orig -> origin/gh/coconutruben/57/orig 2025-09-07T07:51:35.9921893Z * [new branch] gh/coconutruben/58/base -> origin/gh/coconutruben/58/base 2025-09-07T07:51:35.9923683Z * [new branch] gh/coconutruben/58/head -> origin/gh/coconutruben/58/head 2025-09-07T07:51:35.9925533Z * [new branch] gh/coconutruben/58/orig -> origin/gh/coconutruben/58/orig 2025-09-07T07:51:35.9928144Z * [new branch] gh/coconutruben/59/base -> origin/gh/coconutruben/59/base 2025-09-07T07:51:35.9929699Z * [new branch] gh/coconutruben/59/head -> origin/gh/coconutruben/59/head 2025-09-07T07:51:35.9931290Z * [new branch] gh/coconutruben/59/orig -> origin/gh/coconutruben/59/orig 2025-09-07T07:51:35.9933759Z * [new branch] gh/coconutruben/60/base -> origin/gh/coconutruben/60/base 2025-09-07T07:51:35.9935554Z * [new branch] gh/coconutruben/60/head -> origin/gh/coconutruben/60/head 2025-09-07T07:51:35.9937267Z * 
[new branch] gh/coconutruben/60/orig -> origin/gh/coconutruben/60/orig 2025-09-07T07:51:35.9939653Z * [new branch] gh/coconutruben/61/base -> origin/gh/coconutruben/61/base 2025-09-07T07:51:35.9941421Z * [new branch] gh/coconutruben/61/head -> origin/gh/coconutruben/61/head 2025-09-07T07:51:35.9943313Z * [new branch] gh/coconutruben/61/orig -> origin/gh/coconutruben/61/orig 2025-09-07T07:51:35.9946111Z * [new branch] gh/coconutruben/62/base -> origin/gh/coconutruben/62/base 2025-09-07T07:51:35.9947887Z * [new branch] gh/coconutruben/62/head -> origin/gh/coconutruben/62/head 2025-09-07T07:51:35.9949401Z * [new branch] gh/coconutruben/62/orig -> origin/gh/coconutruben/62/orig 2025-09-07T07:51:35.9951866Z * [new branch] gh/coconutruben/63/base -> origin/gh/coconutruben/63/base 2025-09-07T07:51:35.9953575Z * [new branch] gh/coconutruben/63/head -> origin/gh/coconutruben/63/head 2025-09-07T07:51:35.9955623Z * [new branch] gh/coconutruben/63/orig -> origin/gh/coconutruben/63/orig 2025-09-07T07:51:35.9957970Z * [new branch] gh/coconutruben/64/base -> origin/gh/coconutruben/64/base 2025-09-07T07:51:35.9959759Z * [new branch] gh/coconutruben/64/head -> origin/gh/coconutruben/64/head 2025-09-07T07:51:35.9961243Z * [new branch] gh/coconutruben/64/orig -> origin/gh/coconutruben/64/orig 2025-09-07T07:51:35.9963684Z * [new branch] gh/coconutruben/65/base -> origin/gh/coconutruben/65/base 2025-09-07T07:51:35.9965532Z * [new branch] gh/coconutruben/65/head -> origin/gh/coconutruben/65/head 2025-09-07T07:51:35.9967320Z * [new branch] gh/coconutruben/65/orig -> origin/gh/coconutruben/65/orig 2025-09-07T07:51:35.9969644Z * [new branch] gh/coconutruben/66/base -> origin/gh/coconutruben/66/base 2025-09-07T07:51:35.9971230Z * [new branch] gh/coconutruben/66/head -> origin/gh/coconutruben/66/head 2025-09-07T07:51:35.9972832Z * [new branch] gh/coconutruben/66/orig -> origin/gh/coconutruben/66/orig 2025-09-07T07:51:35.9976311Z * [new branch] gh/codingwithsurya/12/base -> 
origin/gh/codingwithsurya/12/base 2025-09-07T07:51:36.0270693Z * [new branch] gh/codingwithsurya/12/head -> origin/gh/codingwithsurya/12/head 2025-09-07T07:51:36.0272450Z * [new branch] gh/codingwithsurya/12/orig -> origin/gh/codingwithsurya/12/orig 2025-09-07T07:51:36.0274862Z * [new branch] gh/codingwithsurya/14/base -> origin/gh/codingwithsurya/14/base 2025-09-07T07:51:36.0276841Z * [new branch] gh/codingwithsurya/14/head -> origin/gh/codingwithsurya/14/head 2025-09-07T07:51:36.0278378Z * [new branch] gh/codingwithsurya/14/orig -> origin/gh/codingwithsurya/14/orig 2025-09-07T07:51:36.0280844Z * [new branch] gh/codingwithsurya/15/base -> origin/gh/codingwithsurya/15/base 2025-09-07T07:51:36.0282584Z * [new branch] gh/codingwithsurya/15/head -> origin/gh/codingwithsurya/15/head 2025-09-07T07:51:36.0284207Z * [new branch] gh/codingwithsurya/15/orig -> origin/gh/codingwithsurya/15/orig 2025-09-07T07:51:36.0287026Z * [new branch] gh/codingwithsurya/16/base -> origin/gh/codingwithsurya/16/base 2025-09-07T07:51:36.0288715Z * [new branch] gh/codingwithsurya/16/head -> origin/gh/codingwithsurya/16/head 2025-09-07T07:51:36.0290283Z * [new branch] gh/codingwithsurya/16/orig -> origin/gh/codingwithsurya/16/orig 2025-09-07T07:51:36.0292703Z * [new branch] gh/codingwithsurya/17/base -> origin/gh/codingwithsurya/17/base 2025-09-07T07:51:36.0294454Z * [new branch] gh/codingwithsurya/17/head -> origin/gh/codingwithsurya/17/head 2025-09-07T07:51:36.0296520Z * [new branch] gh/codingwithsurya/17/orig -> origin/gh/codingwithsurya/17/orig 2025-09-07T07:51:36.0298830Z * [new branch] gh/codingwithsurya/18/base -> origin/gh/codingwithsurya/18/base 2025-09-07T07:51:36.0300517Z * [new branch] gh/codingwithsurya/18/head -> origin/gh/codingwithsurya/18/head 2025-09-07T07:51:36.0302211Z * [new branch] gh/codingwithsurya/18/orig -> origin/gh/codingwithsurya/18/orig 2025-09-07T07:51:36.0304731Z * [new branch] gh/codingwithsurya/19/base -> origin/gh/codingwithsurya/19/base 
2025-09-07T07:51:36.0306961Z * [new branch] gh/codingwithsurya/19/head -> origin/gh/codingwithsurya/19/head 2025-09-07T07:51:36.0308329Z * [new branch] gh/codingwithsurya/19/orig -> origin/gh/codingwithsurya/19/orig 2025-09-07T07:51:36.0310674Z * [new branch] gh/codingwithsurya/20/base -> origin/gh/codingwithsurya/20/base 2025-09-07T07:51:36.0312285Z * [new branch] gh/codingwithsurya/20/head -> origin/gh/codingwithsurya/20/head 2025-09-07T07:51:36.0313894Z * [new branch] gh/codingwithsurya/20/orig -> origin/gh/codingwithsurya/20/orig 2025-09-07T07:51:36.0316737Z * [new branch] gh/codingwithsurya/21/base -> origin/gh/codingwithsurya/21/base 2025-09-07T07:51:36.0318420Z * [new branch] gh/codingwithsurya/21/head -> origin/gh/codingwithsurya/21/head 2025-09-07T07:51:36.0320140Z * [new branch] gh/codingwithsurya/21/orig -> origin/gh/codingwithsurya/21/orig 2025-09-07T07:51:36.0322987Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base 2025-09-07T07:51:36.0324570Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head 2025-09-07T07:51:36.0327337Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base 2025-09-07T07:51:36.0328876Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head 2025-09-07T07:51:36.0331056Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base 2025-09-07T07:51:36.0332495Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head 2025-09-07T07:51:36.0334692Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base 2025-09-07T07:51:36.0336664Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head 2025-09-07T07:51:36.0339586Z * [new branch] gh/davidberard98/382/base -> origin/gh/davidberard98/382/base 2025-09-07T07:51:36.0341373Z * [new branch] gh/davidberard98/382/head -> origin/gh/davidberard98/382/head 2025-09-07T07:51:36.0343086Z * [new branch] gh/davidberard98/382/orig -> origin/gh/davidberard98/382/orig 2025-09-07T07:51:36.0345897Z * 
[new branch] gh/davidberard98/386/base -> origin/gh/davidberard98/386/base 2025-09-07T07:51:36.0347336Z * [new branch] gh/davidberard98/386/head -> origin/gh/davidberard98/386/head 2025-09-07T07:51:36.0348928Z * [new branch] gh/davidberard98/386/orig -> origin/gh/davidberard98/386/orig 2025-09-07T07:51:36.0351307Z * [new branch] gh/davidberard98/391/base -> origin/gh/davidberard98/391/base 2025-09-07T07:51:36.0352846Z * [new branch] gh/davidberard98/391/head -> origin/gh/davidberard98/391/head 2025-09-07T07:51:36.0354428Z * [new branch] gh/davidberard98/391/orig -> origin/gh/davidberard98/391/orig 2025-09-07T07:51:36.0357069Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base 2025-09-07T07:51:36.0358669Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head 2025-09-07T07:51:36.0360144Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig 2025-09-07T07:51:36.0362601Z * [new branch] gh/davidberard98/394/base -> origin/gh/davidberard98/394/base 2025-09-07T07:51:36.0364314Z * [new branch] gh/davidberard98/394/head -> origin/gh/davidberard98/394/head 2025-09-07T07:51:36.0366212Z * [new branch] gh/davidberard98/394/orig -> origin/gh/davidberard98/394/orig 2025-09-07T07:51:36.0368499Z * [new branch] gh/davidberard98/396/base -> origin/gh/davidberard98/396/base 2025-09-07T07:51:36.0370114Z * [new branch] gh/davidberard98/396/head -> origin/gh/davidberard98/396/head 2025-09-07T07:51:36.0371754Z * [new branch] gh/davidberard98/396/orig -> origin/gh/davidberard98/396/orig 2025-09-07T07:51:36.0374480Z * [new branch] gh/davidberard98/397/base -> origin/gh/davidberard98/397/base 2025-09-07T07:51:36.0376362Z * [new branch] gh/davidberard98/397/head -> origin/gh/davidberard98/397/head 2025-09-07T07:51:36.0377950Z * [new branch] gh/davidberard98/397/orig -> origin/gh/davidberard98/397/orig 2025-09-07T07:51:36.0380304Z * [new branch] gh/davidberard98/398/base -> origin/gh/davidberard98/398/base 
2025-09-07T07:51:36.0381898Z * [new branch] gh/davidberard98/398/head -> origin/gh/davidberard98/398/head 2025-09-07T07:51:36.0383570Z * [new branch] gh/davidberard98/398/orig -> origin/gh/davidberard98/398/orig 2025-09-07T07:51:36.0386258Z * [new branch] gh/davidberard98/399/base -> origin/gh/davidberard98/399/base 2025-09-07T07:51:36.0387928Z * [new branch] gh/davidberard98/399/head -> origin/gh/davidberard98/399/head 2025-09-07T07:51:36.0389495Z * [new branch] gh/davidberard98/399/orig -> origin/gh/davidberard98/399/orig 2025-09-07T07:51:36.0391943Z * [new branch] gh/davidberard98/400/base -> origin/gh/davidberard98/400/base 2025-09-07T07:51:36.0393569Z * [new branch] gh/davidberard98/400/head -> origin/gh/davidberard98/400/head 2025-09-07T07:51:36.0395257Z * [new branch] gh/davidberard98/400/orig -> origin/gh/davidberard98/400/orig 2025-09-07T07:51:36.0397752Z * [new branch] gh/davidberard98/401/base -> origin/gh/davidberard98/401/base 2025-09-07T07:51:36.0399304Z * [new branch] gh/davidberard98/401/head -> origin/gh/davidberard98/401/head 2025-09-07T07:51:36.0400908Z * [new branch] gh/davidberard98/401/orig -> origin/gh/davidberard98/401/orig 2025-09-07T07:51:36.0403129Z * [new branch] gh/davidberard98/402/base -> origin/gh/davidberard98/402/base 2025-09-07T07:51:36.0404749Z * [new branch] gh/davidberard98/402/head -> origin/gh/davidberard98/402/head 2025-09-07T07:51:36.0406682Z * [new branch] gh/davidberard98/402/orig -> origin/gh/davidberard98/402/orig 2025-09-07T07:51:36.0408984Z * [new branch] gh/davidberard98/403/base -> origin/gh/davidberard98/403/base 2025-09-07T07:51:36.0410718Z * [new branch] gh/davidberard98/403/head -> origin/gh/davidberard98/403/head 2025-09-07T07:51:36.0412302Z * [new branch] gh/davidberard98/403/orig -> origin/gh/davidberard98/403/orig 2025-09-07T07:51:36.0414605Z * [new branch] gh/davidberard98/404/base -> origin/gh/davidberard98/404/base 2025-09-07T07:51:36.0416504Z * [new branch] gh/davidberard98/404/head -> 
origin/gh/davidberard98/404/head 2025-09-07T07:51:36.0418212Z * [new branch] gh/davidberard98/404/orig -> origin/gh/davidberard98/404/orig 2025-09-07T07:51:36.0420563Z * [new branch] gh/davidberard98/405/base -> origin/gh/davidberard98/405/base 2025-09-07T07:51:36.0422296Z * [new branch] gh/davidberard98/405/head -> origin/gh/davidberard98/405/head 2025-09-07T07:51:36.0423921Z * [new branch] gh/davidberard98/405/orig -> origin/gh/davidberard98/405/orig 2025-09-07T07:51:36.0426650Z * [new branch] gh/davidberard98/406/base -> origin/gh/davidberard98/406/base 2025-09-07T07:51:36.0428363Z * [new branch] gh/davidberard98/406/head -> origin/gh/davidberard98/406/head 2025-09-07T07:51:36.0430075Z * [new branch] gh/davidberard98/406/orig -> origin/gh/davidberard98/406/orig 2025-09-07T07:51:36.0432622Z * [new branch] gh/davidberard98/407/base -> origin/gh/davidberard98/407/base 2025-09-07T07:51:36.0434173Z * [new branch] gh/davidberard98/407/head -> origin/gh/davidberard98/407/head 2025-09-07T07:51:36.0435929Z * [new branch] gh/davidberard98/407/orig -> origin/gh/davidberard98/407/orig 2025-09-07T07:51:36.0438244Z * [new branch] gh/davidberard98/408/base -> origin/gh/davidberard98/408/base 2025-09-07T07:51:36.0439986Z * [new branch] gh/davidberard98/408/head -> origin/gh/davidberard98/408/head 2025-09-07T07:51:36.0441445Z * [new branch] gh/davidberard98/408/orig -> origin/gh/davidberard98/408/orig 2025-09-07T07:51:36.0443522Z * [new branch] gh/davidberard98/409/base -> origin/gh/davidberard98/409/base 2025-09-07T07:51:36.0445312Z * [new branch] gh/davidberard98/409/head -> origin/gh/davidberard98/409/head 2025-09-07T07:51:36.0447213Z * [new branch] gh/davidberard98/409/orig -> origin/gh/davidberard98/409/orig 2025-09-07T07:51:36.0449999Z * [new branch] gh/desertfire/594/base -> origin/gh/desertfire/594/base 2025-09-07T07:51:36.0451631Z * [new branch] gh/desertfire/594/head -> origin/gh/desertfire/594/head 2025-09-07T07:51:36.0453227Z * [new branch] gh/desertfire/594/orig -> 
origin/gh/desertfire/594/orig 2025-09-07T07:51:36.0455748Z * [new branch] gh/desertfire/595/base -> origin/gh/desertfire/595/base 2025-09-07T07:51:36.0457358Z * [new branch] gh/desertfire/595/head -> origin/gh/desertfire/595/head 2025-09-07T07:51:36.0458908Z * [new branch] gh/desertfire/595/orig -> origin/gh/desertfire/595/orig 2025-09-07T07:51:36.0461169Z * [new branch] gh/desertfire/597/base -> origin/gh/desertfire/597/base 2025-09-07T07:51:36.0462949Z * [new branch] gh/desertfire/597/head -> origin/gh/desertfire/597/head 2025-09-07T07:51:36.0464487Z * [new branch] gh/desertfire/597/orig -> origin/gh/desertfire/597/orig 2025-09-07T07:51:36.0467798Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base 2025-09-07T07:51:36.0469460Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head 2025-09-07T07:51:36.0472183Z * [new branch] gh/drisspg/149/base -> origin/gh/drisspg/149/base 2025-09-07T07:51:36.0473808Z * [new branch] gh/drisspg/149/head -> origin/gh/drisspg/149/head 2025-09-07T07:51:36.0475559Z * [new branch] gh/drisspg/149/orig -> origin/gh/drisspg/149/orig 2025-09-07T07:51:36.0477935Z * [new branch] gh/drisspg/159/base -> origin/gh/drisspg/159/base 2025-09-07T07:51:36.0479466Z * [new branch] gh/drisspg/159/head -> origin/gh/drisspg/159/head 2025-09-07T07:51:36.0480990Z * [new branch] gh/drisspg/159/orig -> origin/gh/drisspg/159/orig 2025-09-07T07:51:36.0483256Z * [new branch] gh/drisspg/166/base -> origin/gh/drisspg/166/base 2025-09-07T07:51:36.0484853Z * [new branch] gh/drisspg/166/head -> origin/gh/drisspg/166/head 2025-09-07T07:51:36.0486807Z * [new branch] gh/drisspg/166/orig -> origin/gh/drisspg/166/orig 2025-09-07T07:51:36.0489083Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base 2025-09-07T07:51:36.0490656Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head 2025-09-07T07:51:36.0492215Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig 2025-09-07T07:51:36.0494504Z * [new branch] 
gh/drisspg/173/base -> origin/gh/drisspg/173/base 2025-09-07T07:51:36.0496404Z * [new branch] gh/drisspg/173/head -> origin/gh/drisspg/173/head 2025-09-07T07:51:36.0497960Z * [new branch] gh/drisspg/173/orig -> origin/gh/drisspg/173/orig 2025-09-07T07:51:36.0500285Z * [new branch] gh/drisspg/177/base -> origin/gh/drisspg/177/base 2025-09-07T07:51:36.0501817Z * [new branch] gh/drisspg/177/head -> origin/gh/drisspg/177/head 2025-09-07T07:51:36.0503532Z * [new branch] gh/drisspg/177/orig -> origin/gh/drisspg/177/orig 2025-09-07T07:51:36.0506179Z * [new branch] gh/drisspg/178/base -> origin/gh/drisspg/178/base 2025-09-07T07:51:36.0507652Z * [new branch] gh/drisspg/178/head -> origin/gh/drisspg/178/head 2025-09-07T07:51:36.0509071Z * [new branch] gh/drisspg/178/orig -> origin/gh/drisspg/178/orig 2025-09-07T07:51:36.0511390Z * [new branch] gh/drisspg/180/base -> origin/gh/drisspg/180/base 2025-09-07T07:51:36.0512990Z * [new branch] gh/drisspg/180/head -> origin/gh/drisspg/180/head 2025-09-07T07:51:36.0514734Z * [new branch] gh/drisspg/180/orig -> origin/gh/drisspg/180/orig 2025-09-07T07:51:36.0517426Z * [new branch] gh/drisspg/181/base -> origin/gh/drisspg/181/base 2025-09-07T07:51:36.0518932Z * [new branch] gh/drisspg/181/head -> origin/gh/drisspg/181/head 2025-09-07T07:51:36.0520407Z * [new branch] gh/drisspg/181/orig -> origin/gh/drisspg/181/orig 2025-09-07T07:51:36.0522745Z * [new branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base 2025-09-07T07:51:36.0524406Z * [new branch] gh/drisspg/182/head -> origin/gh/drisspg/182/head 2025-09-07T07:51:36.0526800Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base 2025-09-07T07:51:36.0528331Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head 2025-09-07T07:51:36.0530554Z * [new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base 2025-09-07T07:51:36.0532107Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head 2025-09-07T07:51:36.0534323Z * [new branch] gh/drisspg/185/base -> 
origin/gh/drisspg/185/base 2025-09-07T07:51:36.0536229Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head 2025-09-07T07:51:36.0538491Z * [new branch] gh/drisspg/186/base -> origin/gh/drisspg/186/base 2025-09-07T07:51:36.0540153Z * [new branch] gh/drisspg/186/head -> origin/gh/drisspg/186/head 2025-09-07T07:51:36.0541767Z * [new branch] gh/drisspg/186/orig -> origin/gh/drisspg/186/orig 2025-09-07T07:51:36.0544098Z * [new branch] gh/drisspg/187/base -> origin/gh/drisspg/187/base 2025-09-07T07:51:36.0545988Z * [new branch] gh/drisspg/187/head -> origin/gh/drisspg/187/head 2025-09-07T07:51:36.0547611Z * [new branch] gh/drisspg/187/orig -> origin/gh/drisspg/187/orig 2025-09-07T07:51:36.0549884Z * [new branch] gh/drisspg/188/base -> origin/gh/drisspg/188/base 2025-09-07T07:51:36.0551435Z * [new branch] gh/drisspg/188/head -> origin/gh/drisspg/188/head 2025-09-07T07:51:36.0552995Z * [new branch] gh/drisspg/188/orig -> origin/gh/drisspg/188/orig 2025-09-07T07:51:36.0555363Z * [new branch] gh/drisspg/189/base -> origin/gh/drisspg/189/base 2025-09-07T07:51:36.0557301Z * [new branch] gh/drisspg/189/head -> origin/gh/drisspg/189/head 2025-09-07T07:51:36.0558729Z * [new branch] gh/drisspg/189/orig -> origin/gh/drisspg/189/orig 2025-09-07T07:51:36.0561063Z * [new branch] gh/drisspg/190/base -> origin/gh/drisspg/190/base 2025-09-07T07:51:36.0562651Z * [new branch] gh/drisspg/190/head -> origin/gh/drisspg/190/head 2025-09-07T07:51:36.0564253Z * [new branch] gh/drisspg/190/orig -> origin/gh/drisspg/190/orig 2025-09-07T07:51:36.0566839Z * [new branch] gh/drisspg/191/base -> origin/gh/drisspg/191/base 2025-09-07T07:51:36.0568403Z * [new branch] gh/drisspg/191/head -> origin/gh/drisspg/191/head 2025-09-07T07:51:36.0569901Z * [new branch] gh/drisspg/191/orig -> origin/gh/drisspg/191/orig 2025-09-07T07:51:36.0572149Z * [new branch] gh/drisspg/192/base -> origin/gh/drisspg/192/base 2025-09-07T07:51:36.0574024Z * [new branch] gh/drisspg/192/head -> 
origin/gh/drisspg/192/head 2025-09-07T07:51:36.0575602Z * [new branch] gh/drisspg/192/orig -> origin/gh/drisspg/192/orig 2025-09-07T07:51:36.0578023Z * [new branch] gh/drisspg/193/base -> origin/gh/drisspg/193/base 2025-09-07T07:51:36.0579567Z * [new branch] gh/drisspg/193/head -> origin/gh/drisspg/193/head 2025-09-07T07:51:36.0581145Z * [new branch] gh/drisspg/193/orig -> origin/gh/drisspg/193/orig 2025-09-07T07:51:36.0583507Z * [new branch] gh/drisspg/194/base -> origin/gh/drisspg/194/base 2025-09-07T07:51:36.0585252Z * [new branch] gh/drisspg/194/head -> origin/gh/drisspg/194/head 2025-09-07T07:51:36.0587014Z * [new branch] gh/drisspg/194/orig -> origin/gh/drisspg/194/orig 2025-09-07T07:51:36.0589325Z * [new branch] gh/drisspg/195/base -> origin/gh/drisspg/195/base 2025-09-07T07:51:36.0590889Z * [new branch] gh/drisspg/195/head -> origin/gh/drisspg/195/head 2025-09-07T07:51:36.0592465Z * [new branch] gh/drisspg/195/orig -> origin/gh/drisspg/195/orig 2025-09-07T07:51:36.0594738Z * [new branch] gh/drisspg/196/base -> origin/gh/drisspg/196/base 2025-09-07T07:51:36.0596740Z * [new branch] gh/drisspg/196/head -> origin/gh/drisspg/196/head 2025-09-07T07:51:36.0598378Z * [new branch] gh/drisspg/196/orig -> origin/gh/drisspg/196/orig 2025-09-07T07:51:36.0600628Z * [new branch] gh/drisspg/197/base -> origin/gh/drisspg/197/base 2025-09-07T07:51:36.0602217Z * [new branch] gh/drisspg/197/head -> origin/gh/drisspg/197/head 2025-09-07T07:51:36.0603772Z * [new branch] gh/drisspg/197/orig -> origin/gh/drisspg/197/orig 2025-09-07T07:51:36.0606350Z * [new branch] gh/drisspg/198/base -> origin/gh/drisspg/198/base 2025-09-07T07:51:36.0607940Z * [new branch] gh/drisspg/198/head -> origin/gh/drisspg/198/head 2025-09-07T07:51:36.0609705Z * [new branch] gh/drisspg/198/orig -> origin/gh/drisspg/198/orig 2025-09-07T07:51:36.0611980Z * [new branch] gh/drisspg/199/base -> origin/gh/drisspg/199/base 2025-09-07T07:51:36.0613603Z * [new branch] gh/drisspg/199/head -> 
origin/gh/drisspg/199/head 2025-09-07T07:51:36.0615238Z * [new branch] gh/drisspg/199/orig -> origin/gh/drisspg/199/orig 2025-09-07T07:51:36.0618314Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base 2025-09-07T07:51:36.0620014Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head 2025-09-07T07:51:36.0622760Z * [new branch] gh/eellison/784/base -> origin/gh/eellison/784/base 2025-09-07T07:51:36.0624355Z * [new branch] gh/eellison/784/head -> origin/gh/eellison/784/head 2025-09-07T07:51:36.0626250Z * [new branch] gh/eellison/784/orig -> origin/gh/eellison/784/orig 2025-09-07T07:51:36.0628814Z * [new branch] gh/eellison/785/base -> origin/gh/eellison/785/base 2025-09-07T07:51:36.0630392Z * [new branch] gh/eellison/785/head -> origin/gh/eellison/785/head 2025-09-07T07:51:36.0632010Z * [new branch] gh/eellison/785/orig -> origin/gh/eellison/785/orig 2025-09-07T07:51:36.0634307Z * [new branch] gh/eellison/789/base -> origin/gh/eellison/789/base 2025-09-07T07:51:36.0636246Z * [new branch] gh/eellison/789/head -> origin/gh/eellison/789/head 2025-09-07T07:51:36.0637837Z * [new branch] gh/eellison/789/orig -> origin/gh/eellison/789/orig 2025-09-07T07:51:36.0640066Z * [new branch] gh/eellison/800/base -> origin/gh/eellison/800/base 2025-09-07T07:51:36.0641801Z * [new branch] gh/eellison/800/head -> origin/gh/eellison/800/head 2025-09-07T07:51:36.0643214Z * [new branch] gh/eellison/800/orig -> origin/gh/eellison/800/orig 2025-09-07T07:51:36.0645632Z * [new branch] gh/eellison/801/base -> origin/gh/eellison/801/base 2025-09-07T07:51:36.0647312Z * [new branch] gh/eellison/801/head -> origin/gh/eellison/801/head 2025-09-07T07:51:36.0648839Z * [new branch] gh/eellison/801/orig -> origin/gh/eellison/801/orig 2025-09-07T07:51:36.0651129Z * [new branch] gh/eellison/802/base -> origin/gh/eellison/802/base 2025-09-07T07:51:36.0652759Z * [new branch] gh/eellison/802/head -> origin/gh/eellison/802/head 2025-09-07T07:51:36.0654394Z * [new branch] 
gh/eellison/802/orig -> origin/gh/eellison/802/orig 2025-09-07T07:51:36.0656885Z * [new branch] gh/eellison/805/base -> origin/gh/eellison/805/base 2025-09-07T07:51:36.0658456Z * [new branch] gh/eellison/805/head -> origin/gh/eellison/805/head 2025-09-07T07:51:36.0660101Z * [new branch] gh/eellison/805/orig -> origin/gh/eellison/805/orig 2025-09-07T07:51:36.0662451Z * [new branch] gh/eellison/808/base -> origin/gh/eellison/808/base 2025-09-07T07:51:36.0664120Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head 2025-09-07T07:51:36.0666130Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig 2025-09-07T07:51:36.0668409Z * [new branch] gh/eellison/809/base -> origin/gh/eellison/809/base 2025-09-07T07:51:36.0670486Z * [new branch] gh/eellison/809/head -> origin/gh/eellison/809/head 2025-09-07T07:51:36.0671562Z * [new branch] gh/eellison/809/orig -> origin/gh/eellison/809/orig 2025-09-07T07:51:36.0673923Z * [new branch] gh/eellison/813/base -> origin/gh/eellison/813/base 2025-09-07T07:51:36.0675908Z * [new branch] gh/eellison/813/head -> origin/gh/eellison/813/head 2025-09-07T07:51:36.0677511Z * [new branch] gh/eellison/813/orig -> origin/gh/eellison/813/orig 2025-09-07T07:51:36.0679760Z * [new branch] gh/eellison/814/base -> origin/gh/eellison/814/base 2025-09-07T07:51:36.0681387Z * [new branch] gh/eellison/814/head -> origin/gh/eellison/814/head 2025-09-07T07:51:36.0682920Z * [new branch] gh/eellison/814/orig -> origin/gh/eellison/814/orig 2025-09-07T07:51:36.0685834Z * [new branch] gh/eellison/815/base -> origin/gh/eellison/815/base 2025-09-07T07:51:36.0687451Z * [new branch] gh/eellison/815/head -> origin/gh/eellison/815/head 2025-09-07T07:51:36.0689077Z * [new branch] gh/eellison/815/orig -> origin/gh/eellison/815/orig 2025-09-07T07:51:36.0691350Z * [new branch] gh/eellison/816/base -> origin/gh/eellison/816/base 2025-09-07T07:51:36.0692945Z * [new branch] gh/eellison/816/head -> origin/gh/eellison/816/head 
2025-09-07T07:51:36.0694499Z * [new branch] gh/eellison/816/orig -> origin/gh/eellison/816/orig 2025-09-07T07:51:36.0697064Z * [new branch] gh/eellison/817/base -> origin/gh/eellison/817/base 2025-09-07T07:51:36.0698571Z * [new branch] gh/eellison/817/head -> origin/gh/eellison/817/head 2025-09-07T07:51:36.0700215Z * [new branch] gh/eellison/817/orig -> origin/gh/eellison/817/orig 2025-09-07T07:51:36.0702493Z * [new branch] gh/eellison/818/base -> origin/gh/eellison/818/base 2025-09-07T07:51:36.0704097Z * [new branch] gh/eellison/818/head -> origin/gh/eellison/818/head 2025-09-07T07:51:36.0706117Z * [new branch] gh/eellison/818/orig -> origin/gh/eellison/818/orig 2025-09-07T07:51:36.0708721Z * [new branch] gh/eellison/819/base -> origin/gh/eellison/819/base 2025-09-07T07:51:36.0710104Z * [new branch] gh/eellison/819/head -> origin/gh/eellison/819/head 2025-09-07T07:51:36.0711703Z * [new branch] gh/eellison/819/orig -> origin/gh/eellison/819/orig 2025-09-07T07:51:36.0714153Z * [new branch] gh/eellison/820/base -> origin/gh/eellison/820/base 2025-09-07T07:51:36.0716196Z * [new branch] gh/eellison/820/head -> origin/gh/eellison/820/head 2025-09-07T07:51:36.0717802Z * [new branch] gh/eellison/820/orig -> origin/gh/eellison/820/orig 2025-09-07T07:51:36.0719893Z * [new branch] gh/eellison/821/base -> origin/gh/eellison/821/base 2025-09-07T07:51:36.0721461Z * [new branch] gh/eellison/821/head -> origin/gh/eellison/821/head 2025-09-07T07:51:36.0723079Z * [new branch] gh/eellison/821/orig -> origin/gh/eellison/821/orig 2025-09-07T07:51:36.0725563Z * [new branch] gh/eellison/822/base -> origin/gh/eellison/822/base 2025-09-07T07:51:36.0727246Z * [new branch] gh/eellison/822/head -> origin/gh/eellison/822/head 2025-09-07T07:51:36.0728820Z * [new branch] gh/eellison/822/orig -> origin/gh/eellison/822/orig 2025-09-07T07:51:36.0731291Z * [new branch] gh/eellison/823/base -> origin/gh/eellison/823/base 2025-09-07T07:51:36.0732873Z * [new branch] gh/eellison/823/head -> 
origin/gh/eellison/823/head 2025-09-07T07:51:36.0734443Z * [new branch] gh/eellison/823/orig -> origin/gh/eellison/823/orig 2025-09-07T07:51:36.0737601Z * [new branch] gh/etaf/132/base -> origin/gh/etaf/132/base 2025-09-07T07:51:36.0739066Z * [new branch] gh/etaf/132/head -> origin/gh/etaf/132/head 2025-09-07T07:51:36.0740566Z * [new branch] gh/etaf/132/orig -> origin/gh/etaf/132/orig 2025-09-07T07:51:36.0742933Z * [new branch] gh/etaf/138/base -> origin/gh/etaf/138/base 2025-09-07T07:51:36.0744465Z * [new branch] gh/etaf/138/head -> origin/gh/etaf/138/head 2025-09-07T07:51:36.0746321Z * [new branch] gh/etaf/138/orig -> origin/gh/etaf/138/orig 2025-09-07T07:51:36.0748642Z * [new branch] gh/etaf/140/base -> origin/gh/etaf/140/base 2025-09-07T07:51:36.0751013Z * [new branch] gh/etaf/140/head -> origin/gh/etaf/140/head 2025-09-07T07:51:36.0752593Z * [new branch] gh/etaf/140/orig -> origin/gh/etaf/140/orig 2025-09-07T07:51:36.0754842Z * [new branch] gh/etaf/143/base -> origin/gh/etaf/143/base 2025-09-07T07:51:36.0756759Z * [new branch] gh/etaf/143/head -> origin/gh/etaf/143/head 2025-09-07T07:51:36.0758368Z * [new branch] gh/etaf/143/orig -> origin/gh/etaf/143/orig 2025-09-07T07:51:36.0760700Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base 2025-09-07T07:51:36.0762278Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head 2025-09-07T07:51:36.0764644Z * [new branch] gh/etaf/151/base -> origin/gh/etaf/151/base 2025-09-07T07:51:36.0766968Z * [new branch] gh/etaf/151/head -> origin/gh/etaf/151/head 2025-09-07T07:51:36.0768536Z * [new branch] gh/etaf/151/orig -> origin/gh/etaf/151/orig 2025-09-07T07:51:36.0770921Z * [new branch] gh/etaf/152/base -> origin/gh/etaf/152/base 2025-09-07T07:51:36.0772542Z * [new branch] gh/etaf/152/head -> origin/gh/etaf/152/head 2025-09-07T07:51:36.0774125Z * [new branch] gh/etaf/152/orig -> origin/gh/etaf/152/orig 2025-09-07T07:51:36.0776845Z * [new branch] gh/etaf/153/base -> origin/gh/etaf/153/base 
2025-09-07T07:51:36.0778692Z * [new branch] gh/etaf/153/head -> origin/gh/etaf/153/head 2025-09-07T07:51:36.0780117Z * [new branch] gh/etaf/153/orig -> origin/gh/etaf/153/orig 2025-09-07T07:51:36.0782526Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base 2025-09-07T07:51:36.0784230Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head 2025-09-07T07:51:36.0786008Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig 2025-09-07T07:51:36.0788480Z * [new branch] gh/etaf/155/base -> origin/gh/etaf/155/base 2025-09-07T07:51:36.0790082Z * [new branch] gh/etaf/155/head -> origin/gh/etaf/155/head 2025-09-07T07:51:36.0791685Z * [new branch] gh/etaf/155/orig -> origin/gh/etaf/155/orig 2025-09-07T07:51:36.0793894Z * [new branch] gh/etaf/156/base -> origin/gh/etaf/156/base 2025-09-07T07:51:36.0795752Z * [new branch] gh/etaf/156/head -> origin/gh/etaf/156/head 2025-09-07T07:51:36.0797318Z * [new branch] gh/etaf/156/orig -> origin/gh/etaf/156/orig 2025-09-07T07:51:36.0799751Z * [new branch] gh/etaf/157/base -> origin/gh/etaf/157/base 2025-09-07T07:51:36.0801386Z * [new branch] gh/etaf/157/head -> origin/gh/etaf/157/head 2025-09-07T07:51:36.0803232Z * [new branch] gh/etaf/157/orig -> origin/gh/etaf/157/orig 2025-09-07T07:51:36.0805716Z * [new branch] gh/etaf/158/base -> origin/gh/etaf/158/base 2025-09-07T07:51:36.0807486Z * [new branch] gh/etaf/158/head -> origin/gh/etaf/158/head 2025-09-07T07:51:36.0809048Z * [new branch] gh/etaf/158/orig -> origin/gh/etaf/158/orig 2025-09-07T07:51:36.0811337Z * [new branch] gh/etaf/159/base -> origin/gh/etaf/159/base 2025-09-07T07:51:36.0812978Z * [new branch] gh/etaf/159/head -> origin/gh/etaf/159/head 2025-09-07T07:51:36.0814596Z * [new branch] gh/etaf/159/orig -> origin/gh/etaf/159/orig 2025-09-07T07:51:36.0817164Z * [new branch] gh/etaf/160/base -> origin/gh/etaf/160/base 2025-09-07T07:51:36.0818789Z * [new branch] gh/etaf/160/head -> origin/gh/etaf/160/head 2025-09-07T07:51:36.0820390Z * [new branch] gh/etaf/160/orig -> 
origin/gh/etaf/160/orig 2025-09-07T07:51:36.0822770Z * [new branch] gh/etaf/161/base -> origin/gh/etaf/161/base 2025-09-07T07:51:36.0824360Z * [new branch] gh/etaf/161/head -> origin/gh/etaf/161/head 2025-09-07T07:51:36.0826205Z * [new branch] gh/etaf/161/orig -> origin/gh/etaf/161/orig 2025-09-07T07:51:36.0828623Z * [new branch] gh/etaf/162/base -> origin/gh/etaf/162/base 2025-09-07T07:51:36.0830225Z * [new branch] gh/etaf/162/head -> origin/gh/etaf/162/head 2025-09-07T07:51:36.0831775Z * [new branch] gh/etaf/162/orig -> origin/gh/etaf/162/orig 2025-09-07T07:51:36.0834032Z * [new branch] gh/etaf/163/base -> origin/gh/etaf/163/base 2025-09-07T07:51:36.0835850Z * [new branch] gh/etaf/163/head -> origin/gh/etaf/163/head 2025-09-07T07:51:36.0837429Z * [new branch] gh/etaf/163/orig -> origin/gh/etaf/163/orig 2025-09-07T07:51:36.0839822Z * [new branch] gh/etaf/164/base -> origin/gh/etaf/164/base 2025-09-07T07:51:36.0841396Z * [new branch] gh/etaf/164/head -> origin/gh/etaf/164/head 2025-09-07T07:51:36.0842956Z * [new branch] gh/etaf/164/orig -> origin/gh/etaf/164/orig 2025-09-07T07:51:36.0845462Z * [new branch] gh/etaf/165/base -> origin/gh/etaf/165/base 2025-09-07T07:51:36.0847365Z * [new branch] gh/etaf/165/orig -> origin/gh/etaf/165/orig 2025-09-07T07:51:36.0849456Z * [new branch] gh/etaf/166/base -> origin/gh/etaf/166/base 2025-09-07T07:51:36.0851048Z * [new branch] gh/etaf/166/head -> origin/gh/etaf/166/head 2025-09-07T07:51:36.0852572Z * [new branch] gh/etaf/166/orig -> origin/gh/etaf/166/orig 2025-09-07T07:51:36.0854862Z * [new branch] gh/etaf/167/base -> origin/gh/etaf/167/base 2025-09-07T07:51:36.0856807Z * [new branch] gh/etaf/167/head -> origin/gh/etaf/167/head 2025-09-07T07:51:36.0858469Z * [new branch] gh/etaf/167/orig -> origin/gh/etaf/167/orig 2025-09-07T07:51:36.0860730Z * [new branch] gh/etaf/168/base -> origin/gh/etaf/168/base 2025-09-07T07:51:36.0862493Z * [new branch] gh/etaf/168/head -> origin/gh/etaf/168/head 2025-09-07T07:51:36.0864275Z * [new 
branch] gh/etaf/168/orig -> origin/gh/etaf/168/orig 2025-09-07T07:51:36.0866939Z * [new branch] gh/etaf/169/base -> origin/gh/etaf/169/base 2025-09-07T07:51:36.0868551Z * [new branch] gh/etaf/169/head -> origin/gh/etaf/169/head 2025-09-07T07:51:36.0870076Z * [new branch] gh/etaf/169/orig -> origin/gh/etaf/169/orig 2025-09-07T07:51:36.0872978Z * [new branch] gh/exclamaforte/1/base -> origin/gh/exclamaforte/1/base 2025-09-07T07:51:36.0874711Z * [new branch] gh/exclamaforte/1/head -> origin/gh/exclamaforte/1/head 2025-09-07T07:51:36.0877016Z * [new branch] gh/exclamaforte/2/base -> origin/gh/exclamaforte/2/base 2025-09-07T07:51:36.0878485Z * [new branch] gh/exclamaforte/2/head -> origin/gh/exclamaforte/2/head 2025-09-07T07:51:36.0880811Z * [new branch] gh/exclamaforte/3/base -> origin/gh/exclamaforte/3/base 2025-09-07T07:51:36.0882417Z * [new branch] gh/exclamaforte/3/head -> origin/gh/exclamaforte/3/head 2025-09-07T07:51:36.0884592Z * [new branch] gh/exclamaforte/4/base -> origin/gh/exclamaforte/4/base 2025-09-07T07:51:36.0886535Z * [new branch] gh/exclamaforte/4/head -> origin/gh/exclamaforte/4/head 2025-09-07T07:51:36.0889364Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base 2025-09-07T07:51:36.0890976Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head 2025-09-07T07:51:36.0892587Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig 2025-09-07T07:51:36.0894832Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base 2025-09-07T07:51:36.0896686Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head 2025-09-07T07:51:36.0898232Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig 2025-09-07T07:51:36.0900730Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base 2025-09-07T07:51:36.0903107Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head 2025-09-07T07:51:36.0904114Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig 2025-09-07T07:51:36.0906669Z * [new 
branch] gh/ezyang/3074/base -> origin/gh/ezyang/3074/base 2025-09-07T07:51:36.0908267Z * [new branch] gh/ezyang/3074/head -> origin/gh/ezyang/3074/head 2025-09-07T07:51:36.0909837Z * [new branch] gh/ezyang/3074/orig -> origin/gh/ezyang/3074/orig 2025-09-07T07:51:36.0912079Z * [new branch] gh/ezyang/3088/base -> origin/gh/ezyang/3088/base 2025-09-07T07:51:36.0913616Z * [new branch] gh/ezyang/3088/head -> origin/gh/ezyang/3088/head 2025-09-07T07:51:36.0915513Z * [new branch] gh/ezyang/3088/orig -> origin/gh/ezyang/3088/orig 2025-09-07T07:51:36.0917766Z * [new branch] gh/ezyang/3092/base -> origin/gh/ezyang/3092/base 2025-09-07T07:51:36.0919378Z * [new branch] gh/ezyang/3092/head -> origin/gh/ezyang/3092/head 2025-09-07T07:51:36.0920931Z * [new branch] gh/ezyang/3092/orig -> origin/gh/ezyang/3092/orig 2025-09-07T07:51:36.0923137Z * [new branch] gh/ezyang/3103/base -> origin/gh/ezyang/3103/base 2025-09-07T07:51:36.0924703Z * [new branch] gh/ezyang/3103/head -> origin/gh/ezyang/3103/head 2025-09-07T07:51:36.0926592Z * [new branch] gh/ezyang/3103/orig -> origin/gh/ezyang/3103/orig 2025-09-07T07:51:36.0928782Z * [new branch] gh/ezyang/3105/base -> origin/gh/ezyang/3105/base 2025-09-07T07:51:36.0930395Z * [new branch] gh/ezyang/3105/head -> origin/gh/ezyang/3105/head 2025-09-07T07:51:36.0932021Z * [new branch] gh/ezyang/3105/orig -> origin/gh/ezyang/3105/orig 2025-09-07T07:51:36.0934241Z * [new branch] gh/ezyang/3114/base -> origin/gh/ezyang/3114/base 2025-09-07T07:51:36.0936192Z * [new branch] gh/ezyang/3114/head -> origin/gh/ezyang/3114/head 2025-09-07T07:51:36.0937709Z * [new branch] gh/ezyang/3114/orig -> origin/gh/ezyang/3114/orig 2025-09-07T07:51:36.0940045Z * [new branch] gh/ezyang/3116/base -> origin/gh/ezyang/3116/base 2025-09-07T07:51:36.0941675Z * [new branch] gh/ezyang/3116/head -> origin/gh/ezyang/3116/head 2025-09-07T07:51:36.0943238Z * [new branch] gh/ezyang/3116/orig -> origin/gh/ezyang/3116/orig 2025-09-07T07:51:36.0945784Z * [new branch] 
gh/ezyang/3120/base -> origin/gh/ezyang/3120/base 2025-09-07T07:51:36.0947384Z * [new branch] gh/ezyang/3120/head -> origin/gh/ezyang/3120/head 2025-09-07T07:51:36.0948950Z * [new branch] gh/ezyang/3120/orig -> origin/gh/ezyang/3120/orig 2025-09-07T07:51:36.0951200Z * [new branch] gh/ezyang/3122/base -> origin/gh/ezyang/3122/base 2025-09-07T07:51:36.0952902Z * [new branch] gh/ezyang/3122/head -> origin/gh/ezyang/3122/head 2025-09-07T07:51:36.0954428Z * [new branch] gh/ezyang/3122/orig -> origin/gh/ezyang/3122/orig 2025-09-07T07:51:36.0957193Z * [new branch] gh/ezyang/3123/base -> origin/gh/ezyang/3123/base 2025-09-07T07:51:36.0958723Z * [new branch] gh/ezyang/3123/head -> origin/gh/ezyang/3123/head 2025-09-07T07:51:36.0960281Z * [new branch] gh/ezyang/3123/orig -> origin/gh/ezyang/3123/orig 2025-09-07T07:51:36.0962516Z * [new branch] gh/ezyang/3125/base -> origin/gh/ezyang/3125/base 2025-09-07T07:51:36.0964123Z * [new branch] gh/ezyang/3125/head -> origin/gh/ezyang/3125/head 2025-09-07T07:51:36.0965939Z * [new branch] gh/ezyang/3125/orig -> origin/gh/ezyang/3125/orig 2025-09-07T07:51:36.0968214Z * [new branch] gh/ezyang/3126/base -> origin/gh/ezyang/3126/base 2025-09-07T07:51:36.0969878Z * [new branch] gh/ezyang/3126/head -> origin/gh/ezyang/3126/head 2025-09-07T07:51:36.0971396Z * [new branch] gh/ezyang/3126/orig -> origin/gh/ezyang/3126/orig 2025-09-07T07:51:36.0973567Z * [new branch] gh/ezyang/3127/base -> origin/gh/ezyang/3127/base 2025-09-07T07:51:36.0975256Z * [new branch] gh/ezyang/3127/head -> origin/gh/ezyang/3127/head 2025-09-07T07:51:36.0977138Z * [new branch] gh/ezyang/3127/orig -> origin/gh/ezyang/3127/orig 2025-09-07T07:51:36.0979337Z * [new branch] gh/ezyang/3128/base -> origin/gh/ezyang/3128/base 2025-09-07T07:51:36.0980897Z * [new branch] gh/ezyang/3128/head -> origin/gh/ezyang/3128/head 2025-09-07T07:51:36.0982812Z * [new branch] gh/ezyang/3128/orig -> origin/gh/ezyang/3128/orig 2025-09-07T07:51:36.0984890Z * [new branch] gh/ezyang/3129/base -> 
origin/gh/ezyang/3129/base 2025-09-07T07:51:36.0986773Z * [new branch] gh/ezyang/3129/head -> origin/gh/ezyang/3129/head 2025-09-07T07:51:36.0988370Z * [new branch] gh/ezyang/3129/orig -> origin/gh/ezyang/3129/orig 2025-09-07T07:51:36.0990637Z * [new branch] gh/ezyang/3130/base -> origin/gh/ezyang/3130/base 2025-09-07T07:51:36.0992170Z * [new branch] gh/ezyang/3130/head -> origin/gh/ezyang/3130/head 2025-09-07T07:51:36.0993939Z * [new branch] gh/ezyang/3130/orig -> origin/gh/ezyang/3130/orig 2025-09-07T07:51:36.0996596Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base 2025-09-07T07:51:36.0998200Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head 2025-09-07T07:51:36.0999771Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig 2025-09-07T07:51:36.1001977Z * [new branch] gh/ezyang/3132/base -> origin/gh/ezyang/3132/base 2025-09-07T07:51:36.1003613Z * [new branch] gh/ezyang/3132/head -> origin/gh/ezyang/3132/head 2025-09-07T07:51:36.1005298Z * [new branch] gh/ezyang/3132/orig -> origin/gh/ezyang/3132/orig 2025-09-07T07:51:36.1007654Z * [new branch] gh/ezyang/3133/base -> origin/gh/ezyang/3133/base 2025-09-07T07:51:36.1009254Z * [new branch] gh/ezyang/3133/head -> origin/gh/ezyang/3133/head 2025-09-07T07:51:36.1010833Z * [new branch] gh/ezyang/3133/orig -> origin/gh/ezyang/3133/orig 2025-09-07T07:51:36.1013096Z * [new branch] gh/ezyang/3134/base -> origin/gh/ezyang/3134/base 2025-09-07T07:51:36.1014798Z * [new branch] gh/ezyang/3134/head -> origin/gh/ezyang/3134/head 2025-09-07T07:51:36.1016679Z * [new branch] gh/ezyang/3134/orig -> origin/gh/ezyang/3134/orig 2025-09-07T07:51:36.1018892Z * [new branch] gh/ezyang/3135/base -> origin/gh/ezyang/3135/base 2025-09-07T07:51:36.1020486Z * [new branch] gh/ezyang/3135/head -> origin/gh/ezyang/3135/head 2025-09-07T07:51:36.1022296Z * [new branch] gh/ezyang/3135/orig -> origin/gh/ezyang/3135/orig 2025-09-07T07:51:36.1024569Z * [new branch] gh/ezyang/3136/base -> 
origin/gh/ezyang/3136/base 2025-09-07T07:51:36.1026458Z * [new branch] gh/ezyang/3136/head -> origin/gh/ezyang/3136/head 2025-09-07T07:51:36.1028005Z * [new branch] gh/ezyang/3136/orig -> origin/gh/ezyang/3136/orig 2025-09-07T07:51:36.1030233Z * [new branch] gh/ezyang/3137/base -> origin/gh/ezyang/3137/base 2025-09-07T07:51:36.1031805Z * [new branch] gh/ezyang/3137/head -> origin/gh/ezyang/3137/head 2025-09-07T07:51:36.1033404Z * [new branch] gh/ezyang/3137/orig -> origin/gh/ezyang/3137/orig 2025-09-07T07:51:36.1035932Z * [new branch] gh/ezyang/3138/base -> origin/gh/ezyang/3138/base 2025-09-07T07:51:36.1037454Z * [new branch] gh/ezyang/3138/head -> origin/gh/ezyang/3138/head 2025-09-07T07:51:36.1039004Z * [new branch] gh/ezyang/3138/orig -> origin/gh/ezyang/3138/orig 2025-09-07T07:51:36.1041269Z * [new branch] gh/ezyang/3139/base -> origin/gh/ezyang/3139/base 2025-09-07T07:51:36.1042803Z * [new branch] gh/ezyang/3139/head -> origin/gh/ezyang/3139/head 2025-09-07T07:51:36.1044315Z * [new branch] gh/ezyang/3139/orig -> origin/gh/ezyang/3139/orig 2025-09-07T07:51:36.1047157Z * [new branch] gh/ezyang/3140/base -> origin/gh/ezyang/3140/base 2025-09-07T07:51:36.1048873Z * [new branch] gh/ezyang/3140/head -> origin/gh/ezyang/3140/head 2025-09-07T07:51:36.1050261Z * [new branch] gh/ezyang/3140/orig -> origin/gh/ezyang/3140/orig 2025-09-07T07:51:36.1052603Z * [new branch] gh/ezyang/3141/base -> origin/gh/ezyang/3141/base 2025-09-07T07:51:36.1054269Z * [new branch] gh/ezyang/3141/head -> origin/gh/ezyang/3141/head 2025-09-07T07:51:36.1056095Z * [new branch] gh/ezyang/3141/orig -> origin/gh/ezyang/3141/orig 2025-09-07T07:51:36.1058373Z * [new branch] gh/ezyang/3142/base -> origin/gh/ezyang/3142/base 2025-09-07T07:51:36.1060009Z * [new branch] gh/ezyang/3142/head -> origin/gh/ezyang/3142/head 2025-09-07T07:51:36.1061538Z * [new branch] gh/ezyang/3142/orig -> origin/gh/ezyang/3142/orig 2025-09-07T07:51:36.1063815Z * [new branch] gh/ezyang/3143/base -> 
origin/gh/ezyang/3143/base 2025-09-07T07:51:36.1065505Z * [new branch] gh/ezyang/3143/head -> origin/gh/ezyang/3143/head 2025-09-07T07:51:36.1067203Z * [new branch] gh/ezyang/3143/orig -> origin/gh/ezyang/3143/orig 2025-09-07T07:51:36.1070116Z * [new branch] gh/fadara01/1/base -> origin/gh/fadara01/1/base 2025-09-07T07:51:36.1072834Z * [new branch] gh/fadara01/1/head -> origin/gh/fadara01/1/head 2025-09-07T07:51:36.1074432Z * [new branch] gh/fadara01/1/orig -> origin/gh/fadara01/1/orig 2025-09-07T07:51:36.1077638Z * [new branch] gh/fduwjj/171/base -> origin/gh/fduwjj/171/base 2025-09-07T07:51:36.1079256Z * [new branch] gh/fduwjj/171/head -> origin/gh/fduwjj/171/head 2025-09-07T07:51:36.1080804Z * [new branch] gh/fduwjj/171/orig -> origin/gh/fduwjj/171/orig 2025-09-07T07:51:36.1083226Z * [new branch] gh/fduwjj/175/base -> origin/gh/fduwjj/175/base 2025-09-07T07:51:36.1085156Z * [new branch] gh/fduwjj/175/head -> origin/gh/fduwjj/175/head 2025-09-07T07:51:36.1086876Z * [new branch] gh/fduwjj/175/orig -> origin/gh/fduwjj/175/orig 2025-09-07T07:51:36.1089237Z * [new branch] gh/fduwjj/176/base -> origin/gh/fduwjj/176/base 2025-09-07T07:51:36.1090926Z * [new branch] gh/fduwjj/176/head -> origin/gh/fduwjj/176/head 2025-09-07T07:51:36.1092476Z * [new branch] gh/fduwjj/176/orig -> origin/gh/fduwjj/176/orig 2025-09-07T07:51:36.1094724Z * [new branch] gh/fduwjj/177/base -> origin/gh/fduwjj/177/base 2025-09-07T07:51:36.1096778Z * [new branch] gh/fduwjj/177/head -> origin/gh/fduwjj/177/head 2025-09-07T07:51:36.1098311Z * [new branch] gh/fduwjj/177/orig -> origin/gh/fduwjj/177/orig 2025-09-07T07:51:36.1100585Z * [new branch] gh/fduwjj/178/base -> origin/gh/fduwjj/178/base 2025-09-07T07:51:36.1102444Z * [new branch] gh/fduwjj/178/head -> origin/gh/fduwjj/178/head 2025-09-07T07:51:36.1103967Z * [new branch] gh/fduwjj/178/orig -> origin/gh/fduwjj/178/orig 2025-09-07T07:51:36.1106467Z * [new branch] gh/fduwjj/179/base -> origin/gh/fduwjj/179/base 2025-09-07T07:51:36.1108005Z * [new 
branch] gh/fduwjj/179/head -> origin/gh/fduwjj/179/head 2025-09-07T07:51:36.1109617Z * [new branch] gh/fduwjj/179/orig -> origin/gh/fduwjj/179/orig 2025-09-07T07:51:36.1111885Z * [new branch] gh/fduwjj/180/base -> origin/gh/fduwjj/180/base 2025-09-07T07:51:36.1113465Z * [new branch] gh/fduwjj/180/head -> origin/gh/fduwjj/180/head 2025-09-07T07:51:36.1115239Z * [new branch] gh/fduwjj/180/orig -> origin/gh/fduwjj/180/orig 2025-09-07T07:51:36.1117606Z * [new branch] gh/fduwjj/181/base -> origin/gh/fduwjj/181/base 2025-09-07T07:51:36.1119345Z * [new branch] gh/fduwjj/181/head -> origin/gh/fduwjj/181/head 2025-09-07T07:51:36.1120747Z * [new branch] gh/fduwjj/181/orig -> origin/gh/fduwjj/181/orig 2025-09-07T07:51:36.1122903Z * [new branch] gh/fduwjj/182/base -> origin/gh/fduwjj/182/base 2025-09-07T07:51:36.1124510Z * [new branch] gh/fduwjj/182/head -> origin/gh/fduwjj/182/head 2025-09-07T07:51:36.1126406Z * [new branch] gh/fduwjj/182/orig -> origin/gh/fduwjj/182/orig 2025-09-07T07:51:36.1128816Z * [new branch] gh/fduwjj/183/base -> origin/gh/fduwjj/183/base 2025-09-07T07:51:36.1130540Z * [new branch] gh/fduwjj/183/head -> origin/gh/fduwjj/183/head 2025-09-07T07:51:36.1132104Z * [new branch] gh/fduwjj/183/orig -> origin/gh/fduwjj/183/orig 2025-09-07T07:51:36.1134581Z * [new branch] gh/fduwjj/184/base -> origin/gh/fduwjj/184/base 2025-09-07T07:51:36.1136688Z * [new branch] gh/fduwjj/184/head -> origin/gh/fduwjj/184/head 2025-09-07T07:51:36.1138181Z * [new branch] gh/fduwjj/184/orig -> origin/gh/fduwjj/184/orig 2025-09-07T07:51:36.1140480Z * [new branch] gh/fduwjj/185/base -> origin/gh/fduwjj/185/base 2025-09-07T07:51:36.1142290Z * [new branch] gh/fduwjj/185/head -> origin/gh/fduwjj/185/head 2025-09-07T07:51:36.1143816Z * [new branch] gh/fduwjj/185/orig -> origin/gh/fduwjj/185/orig 2025-09-07T07:51:36.1146253Z * [new branch] gh/fduwjj/186/base -> origin/gh/fduwjj/186/base 2025-09-07T07:51:36.1147926Z * [new branch] gh/fduwjj/186/head -> origin/gh/fduwjj/186/head 
2025-09-07T07:51:36.1149530Z * [new branch] gh/fduwjj/186/orig -> origin/gh/fduwjj/186/orig 2025-09-07T07:51:36.1151612Z * [new branch] gh/fduwjj/187/base -> origin/gh/fduwjj/187/base 2025-09-07T07:51:36.1153134Z * [new branch] gh/fduwjj/187/head -> origin/gh/fduwjj/187/head 2025-09-07T07:51:36.1154706Z * [new branch] gh/fduwjj/187/orig -> origin/gh/fduwjj/187/orig 2025-09-07T07:51:36.1157266Z * [new branch] gh/fduwjj/188/base -> origin/gh/fduwjj/188/base 2025-09-07T07:51:36.1158772Z * [new branch] gh/fduwjj/188/head -> origin/gh/fduwjj/188/head 2025-09-07T07:51:36.1160311Z * [new branch] gh/fduwjj/188/orig -> origin/gh/fduwjj/188/orig 2025-09-07T07:51:36.1162417Z * [new branch] gh/fduwjj/189/base -> origin/gh/fduwjj/189/base 2025-09-07T07:51:36.1164150Z * [new branch] gh/fduwjj/189/head -> origin/gh/fduwjj/189/head 2025-09-07T07:51:36.1165711Z * [new branch] gh/fduwjj/189/orig -> origin/gh/fduwjj/189/orig 2025-09-07T07:51:36.1168040Z * [new branch] gh/fduwjj/190/base -> origin/gh/fduwjj/190/base 2025-09-07T07:51:36.1169594Z * [new branch] gh/fduwjj/190/head -> origin/gh/fduwjj/190/head 2025-09-07T07:51:36.1171307Z * [new branch] gh/fduwjj/190/orig -> origin/gh/fduwjj/190/orig 2025-09-07T07:51:36.1173335Z * [new branch] gh/fduwjj/191/base -> origin/gh/fduwjj/191/base 2025-09-07T07:51:36.1175054Z * [new branch] gh/fduwjj/191/head -> origin/gh/fduwjj/191/head 2025-09-07T07:51:36.1176804Z * [new branch] gh/fduwjj/191/orig -> origin/gh/fduwjj/191/orig 2025-09-07T07:51:36.1179550Z * [new branch] gh/fegin/306/base -> origin/gh/fegin/306/base 2025-09-07T07:51:36.1181158Z * [new branch] gh/fegin/306/head -> origin/gh/fegin/306/head 2025-09-07T07:51:36.1182838Z * [new branch] gh/fegin/306/orig -> origin/gh/fegin/306/orig 2025-09-07T07:51:36.1186683Z * [new branch] gh/fegin/307/base -> origin/gh/fegin/307/base 2025-09-07T07:51:36.1187170Z * [new branch] gh/fegin/307/head -> origin/gh/fegin/307/head 2025-09-07T07:51:36.1188737Z * [new branch] gh/fegin/307/orig -> 
origin/gh/fegin/307/orig 2025-09-07T07:51:36.1190947Z * [new branch] gh/fegin/308/base -> origin/gh/fegin/308/base 2025-09-07T07:51:36.1192583Z * [new branch] gh/fegin/308/head -> origin/gh/fegin/308/head 2025-09-07T07:51:36.1194164Z * [new branch] gh/fegin/308/orig -> origin/gh/fegin/308/orig 2025-09-07T07:51:36.1196954Z * [new branch] gh/fegin/309/base -> origin/gh/fegin/309/base 2025-09-07T07:51:36.1198349Z * [new branch] gh/fegin/309/head -> origin/gh/fegin/309/head 2025-09-07T07:51:36.1199914Z * [new branch] gh/fegin/309/orig -> origin/gh/fegin/309/orig 2025-09-07T07:51:36.1202119Z * [new branch] gh/fegin/310/base -> origin/gh/fegin/310/base 2025-09-07T07:51:36.1203800Z * [new branch] gh/fegin/310/head -> origin/gh/fegin/310/head 2025-09-07T07:51:36.1205749Z * [new branch] gh/fegin/310/orig -> origin/gh/fegin/310/orig 2025-09-07T07:51:36.1207974Z * [new branch] gh/fegin/311/base -> origin/gh/fegin/311/base 2025-09-07T07:51:36.1209591Z * [new branch] gh/fegin/311/head -> origin/gh/fegin/311/head 2025-09-07T07:51:36.1211230Z * [new branch] gh/fegin/311/orig -> origin/gh/fegin/311/orig 2025-09-07T07:51:36.1213324Z * [new branch] gh/fegin/312/base -> origin/gh/fegin/312/base 2025-09-07T07:51:36.1214900Z * [new branch] gh/fegin/312/head -> origin/gh/fegin/312/head 2025-09-07T07:51:36.1216848Z * [new branch] gh/fegin/312/orig -> origin/gh/fegin/312/orig 2025-09-07T07:51:36.1219060Z * [new branch] gh/fegin/313/base -> origin/gh/fegin/313/base 2025-09-07T07:51:36.1220604Z * [new branch] gh/fegin/313/head -> origin/gh/fegin/313/head 2025-09-07T07:51:36.1222310Z * [new branch] gh/fegin/313/orig -> origin/gh/fegin/313/orig 2025-09-07T07:51:36.1225322Z * [new branch] gh/fffrog/124/base -> origin/gh/fffrog/124/base 2025-09-07T07:51:36.1227017Z * [new branch] gh/fffrog/124/head -> origin/gh/fffrog/124/head 2025-09-07T07:51:36.1228627Z * [new branch] gh/fffrog/124/orig -> origin/gh/fffrog/124/orig 2025-09-07T07:51:36.1230913Z * [new branch] gh/fffrog/129/base -> 
origin/gh/fffrog/129/base 2025-09-07T07:51:36.1232414Z * [new branch] gh/fffrog/129/head -> origin/gh/fffrog/129/head 2025-09-07T07:51:36.1233978Z * [new branch] gh/fffrog/129/orig -> origin/gh/fffrog/129/orig 2025-09-07T07:51:36.1236612Z * [new branch] gh/fffrog/130/base -> origin/gh/fffrog/130/base 2025-09-07T07:51:36.1238107Z * [new branch] gh/fffrog/130/head -> origin/gh/fffrog/130/head 2025-09-07T07:51:36.1239711Z * [new branch] gh/fffrog/130/orig -> origin/gh/fffrog/130/orig 2025-09-07T07:51:36.1242015Z * [new branch] gh/fffrog/131/base -> origin/gh/fffrog/131/base 2025-09-07T07:51:36.1243635Z * [new branch] gh/fffrog/131/head -> origin/gh/fffrog/131/head 2025-09-07T07:51:36.1245434Z * [new branch] gh/fffrog/131/orig -> origin/gh/fffrog/131/orig 2025-09-07T07:51:36.1247868Z * [new branch] gh/fffrog/132/base -> origin/gh/fffrog/132/base 2025-09-07T07:51:36.1249459Z * [new branch] gh/fffrog/132/head -> origin/gh/fffrog/132/head 2025-09-07T07:51:36.1250993Z * [new branch] gh/fffrog/132/orig -> origin/gh/fffrog/132/orig 2025-09-07T07:51:36.1253406Z * [new branch] gh/fffrog/133/base -> origin/gh/fffrog/133/base 2025-09-07T07:51:36.1254840Z * [new branch] gh/fffrog/133/head -> origin/gh/fffrog/133/head 2025-09-07T07:51:36.1256707Z * [new branch] gh/fffrog/133/orig -> origin/gh/fffrog/133/orig 2025-09-07T07:51:36.1259005Z * [new branch] gh/fffrog/134/base -> origin/gh/fffrog/134/base 2025-09-07T07:51:36.1260540Z * [new branch] gh/fffrog/134/head -> origin/gh/fffrog/134/head 2025-09-07T07:51:36.1262397Z * [new branch] gh/fffrog/134/orig -> origin/gh/fffrog/134/orig 2025-09-07T07:51:36.1264591Z * [new branch] gh/fffrog/135/base -> origin/gh/fffrog/135/base 2025-09-07T07:51:36.1266443Z * [new branch] gh/fffrog/135/head -> origin/gh/fffrog/135/head 2025-09-07T07:51:36.1268241Z * [new branch] gh/fffrog/135/orig -> origin/gh/fffrog/135/orig 2025-09-07T07:51:36.1270274Z * [new branch] gh/fffrog/136/base -> origin/gh/fffrog/136/base 2025-09-07T07:51:36.1273354Z * [new 
branch] gh/fffrog/136/head -> origin/gh/fffrog/136/head 2025-09-07T07:51:36.1274056Z * [new branch] gh/fffrog/136/orig -> origin/gh/fffrog/136/orig 2025-09-07T07:51:36.1276033Z * [new branch] gh/fffrog/137/base -> origin/gh/fffrog/137/base 2025-09-07T07:51:36.1277411Z * [new branch] gh/fffrog/137/head -> origin/gh/fffrog/137/head 2025-09-07T07:51:36.1279193Z * [new branch] gh/fffrog/137/orig -> origin/gh/fffrog/137/orig 2025-09-07T07:51:36.1281459Z * [new branch] gh/fffrog/138/base -> origin/gh/fffrog/138/base 2025-09-07T07:51:36.1283091Z * [new branch] gh/fffrog/138/head -> origin/gh/fffrog/138/head 2025-09-07T07:51:36.1284634Z * [new branch] gh/fffrog/138/orig -> origin/gh/fffrog/138/orig 2025-09-07T07:51:36.1287190Z * [new branch] gh/fffrog/139/base -> origin/gh/fffrog/139/base 2025-09-07T07:51:36.1288731Z * [new branch] gh/fffrog/139/head -> origin/gh/fffrog/139/head 2025-09-07T07:51:36.1290338Z * [new branch] gh/fffrog/139/orig -> origin/gh/fffrog/139/orig 2025-09-07T07:51:36.1292558Z * [new branch] gh/fffrog/140/base -> origin/gh/fffrog/140/base 2025-09-07T07:51:36.1294123Z * [new branch] gh/fffrog/140/head -> origin/gh/fffrog/140/head 2025-09-07T07:51:36.1295933Z * [new branch] gh/fffrog/140/orig -> origin/gh/fffrog/140/orig 2025-09-07T07:51:36.1298211Z * [new branch] gh/fffrog/141/base -> origin/gh/fffrog/141/base 2025-09-07T07:51:36.1299701Z * [new branch] gh/fffrog/141/head -> origin/gh/fffrog/141/head 2025-09-07T07:51:36.1301237Z * [new branch] gh/fffrog/141/orig -> origin/gh/fffrog/141/orig 2025-09-07T07:51:36.1303831Z * [new branch] gh/fffrog/142/base -> origin/gh/fffrog/142/base 2025-09-07T07:51:36.1305503Z * [new branch] gh/fffrog/142/head -> origin/gh/fffrog/142/head 2025-09-07T07:51:36.1307084Z * [new branch] gh/fffrog/142/orig -> origin/gh/fffrog/142/orig 2025-09-07T07:51:36.1309419Z * [new branch] gh/fffrog/143/base -> origin/gh/fffrog/143/base 2025-09-07T07:51:36.1311070Z * [new branch] gh/fffrog/143/head -> origin/gh/fffrog/143/head 
2025-09-07T07:51:36.1312626Z * [new branch] gh/fffrog/143/orig -> origin/gh/fffrog/143/orig 2025-09-07T07:51:36.1315694Z * [new branch] gh/fffrog/144/base -> origin/gh/fffrog/144/base 2025-09-07T07:51:36.1316956Z * [new branch] gh/fffrog/144/head -> origin/gh/fffrog/144/head 2025-09-07T07:51:36.1318544Z * [new branch] gh/fffrog/144/orig -> origin/gh/fffrog/144/orig 2025-09-07T07:51:36.1320889Z * [new branch] gh/fffrog/145/base -> origin/gh/fffrog/145/base 2025-09-07T07:51:36.1322309Z * [new branch] gh/fffrog/145/head -> origin/gh/fffrog/145/head 2025-09-07T07:51:36.1323816Z * [new branch] gh/fffrog/145/orig -> origin/gh/fffrog/145/orig 2025-09-07T07:51:36.1326689Z * [new branch] gh/fffrog/146/base -> origin/gh/fffrog/146/base 2025-09-07T07:51:36.1328279Z * [new branch] gh/fffrog/146/head -> origin/gh/fffrog/146/head 2025-09-07T07:51:36.1329750Z * [new branch] gh/fffrog/146/orig -> origin/gh/fffrog/146/orig 2025-09-07T07:51:36.1332043Z * [new branch] gh/fffrog/147/base -> origin/gh/fffrog/147/base 2025-09-07T07:51:36.1333683Z * [new branch] gh/fffrog/147/head -> origin/gh/fffrog/147/head 2025-09-07T07:51:36.1335450Z * [new branch] gh/fffrog/147/orig -> origin/gh/fffrog/147/orig 2025-09-07T07:51:36.1337845Z * [new branch] gh/fffrog/148/base -> origin/gh/fffrog/148/base 2025-09-07T07:51:36.1339399Z * [new branch] gh/fffrog/148/head -> origin/gh/fffrog/148/head 2025-09-07T07:51:36.1340918Z * [new branch] gh/fffrog/148/orig -> origin/gh/fffrog/148/orig 2025-09-07T07:51:36.1343333Z * [new branch] gh/fffrog/149/base -> origin/gh/fffrog/149/base 2025-09-07T07:51:36.1344879Z * [new branch] gh/fffrog/149/head -> origin/gh/fffrog/149/head 2025-09-07T07:51:36.1346824Z * [new branch] gh/fffrog/149/orig -> origin/gh/fffrog/149/orig 2025-09-07T07:51:36.1348988Z * [new branch] gh/fffrog/150/base -> origin/gh/fffrog/150/base 2025-09-07T07:51:36.1350571Z * [new branch] gh/fffrog/150/head -> origin/gh/fffrog/150/head 2025-09-07T07:51:36.1352150Z * [new branch] gh/fffrog/150/orig -> 
origin/gh/fffrog/150/orig 2025-09-07T07:51:36.1354484Z * [new branch] gh/fffrog/151/base -> origin/gh/fffrog/151/base 2025-09-07T07:51:36.1356411Z * [new branch] gh/fffrog/151/head -> origin/gh/fffrog/151/head 2025-09-07T07:51:36.1358018Z * [new branch] gh/fffrog/151/orig -> origin/gh/fffrog/151/orig 2025-09-07T07:51:36.1360301Z * [new branch] gh/fffrog/152/base -> origin/gh/fffrog/152/base 2025-09-07T07:51:36.1361860Z * [new branch] gh/fffrog/152/head -> origin/gh/fffrog/152/head 2025-09-07T07:51:36.1364095Z * [new branch] gh/fffrog/153/base -> origin/gh/fffrog/153/base 2025-09-07T07:51:36.1365991Z * [new branch] gh/fffrog/153/head -> origin/gh/fffrog/153/head 2025-09-07T07:51:36.1367515Z * [new branch] gh/fffrog/153/orig -> origin/gh/fffrog/153/orig 2025-09-07T07:51:36.1370390Z * [new branch] gh/gmagogsfm/1/base -> origin/gh/gmagogsfm/1/base 2025-09-07T07:51:36.1371936Z * [new branch] gh/gmagogsfm/1/head -> origin/gh/gmagogsfm/1/head 2025-09-07T07:51:36.1373725Z * [new branch] gh/gmagogsfm/1/orig -> origin/gh/gmagogsfm/1/orig 2025-09-07T07:51:36.1376097Z * [new branch] gh/gmagogsfm/2/base -> origin/gh/gmagogsfm/2/base 2025-09-07T07:51:36.1377632Z * [new branch] gh/gmagogsfm/2/head -> origin/gh/gmagogsfm/2/head 2025-09-07T07:51:36.1379177Z * [new branch] gh/gmagogsfm/2/orig -> origin/gh/gmagogsfm/2/orig 2025-09-07T07:51:36.1381379Z * [new branch] gh/gmagogsfm/3/base -> origin/gh/gmagogsfm/3/base 2025-09-07T07:51:36.1383198Z * [new branch] gh/gmagogsfm/3/head -> origin/gh/gmagogsfm/3/head 2025-09-07T07:51:36.1384727Z * [new branch] gh/gmagogsfm/3/orig -> origin/gh/gmagogsfm/3/orig 2025-09-07T07:51:36.1388038Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base 2025-09-07T07:51:36.1389431Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head 2025-09-07T07:51:36.1391109Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig 2025-09-07T07:51:36.1393286Z * [new branch] gh/guangyey/135/base -> origin/gh/guangyey/135/base 
2025-09-07T07:51:36.1394880Z * [new branch] gh/guangyey/135/head -> origin/gh/guangyey/135/head 2025-09-07T07:51:36.1396764Z * [new branch] gh/guangyey/135/orig -> origin/gh/guangyey/135/orig 2025-09-07T07:51:36.1398901Z * [new branch] gh/guangyey/139/base -> origin/gh/guangyey/139/base 2025-09-07T07:51:36.1400510Z * [new branch] gh/guangyey/139/head -> origin/gh/guangyey/139/head 2025-09-07T07:51:36.1402092Z * [new branch] gh/guangyey/139/orig -> origin/gh/guangyey/139/orig 2025-09-07T07:51:36.1404373Z * [new branch] gh/guangyey/140/base -> origin/gh/guangyey/140/base 2025-09-07T07:51:36.1406347Z * [new branch] gh/guangyey/140/head -> origin/gh/guangyey/140/head 2025-09-07T07:51:36.1407919Z * [new branch] gh/guangyey/140/orig -> origin/gh/guangyey/140/orig 2025-09-07T07:51:36.1410216Z * [new branch] gh/guangyey/142/base -> origin/gh/guangyey/142/base 2025-09-07T07:51:36.1411775Z * [new branch] gh/guangyey/142/head -> origin/gh/guangyey/142/head 2025-09-07T07:51:36.1413339Z * [new branch] gh/guangyey/142/orig -> origin/gh/guangyey/142/orig 2025-09-07T07:51:36.1415833Z * [new branch] gh/guangyey/145/base -> origin/gh/guangyey/145/base 2025-09-07T07:51:36.1417406Z * [new branch] gh/guangyey/145/head -> origin/gh/guangyey/145/head 2025-09-07T07:51:36.1418931Z * [new branch] gh/guangyey/145/orig -> origin/gh/guangyey/145/orig 2025-09-07T07:51:36.1421144Z * [new branch] gh/guangyey/153/base -> origin/gh/guangyey/153/base 2025-09-07T07:51:36.1423035Z * [new branch] gh/guangyey/153/head -> origin/gh/guangyey/153/head 2025-09-07T07:51:36.1424496Z * [new branch] gh/guangyey/153/orig -> origin/gh/guangyey/153/orig 2025-09-07T07:51:36.1427024Z * [new branch] gh/guangyey/159/base -> origin/gh/guangyey/159/base 2025-09-07T07:51:36.1428607Z * [new branch] gh/guangyey/159/head -> origin/gh/guangyey/159/head 2025-09-07T07:51:36.1430166Z * [new branch] gh/guangyey/159/orig -> origin/gh/guangyey/159/orig 2025-09-07T07:51:36.1432399Z * [new branch] gh/guangyey/163/base -> 
origin/gh/guangyey/163/base 2025-09-07T07:51:36.1434172Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head 2025-09-07T07:51:36.1436088Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig 2025-09-07T07:51:36.1454538Z * [new branch] gh/guangyey/168/base -> origin/gh/guangyey/168/base 2025-09-07T07:51:36.1455289Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head 2025-09-07T07:51:36.1455722Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig 2025-09-07T07:51:36.1456155Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base 2025-09-07T07:51:36.1456556Z * [new branch] gh/guangyey/169/head -> origin/gh/guangyey/169/head 2025-09-07T07:51:36.1456955Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig 2025-09-07T07:51:36.1457351Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base 2025-09-07T07:51:36.1457748Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head 2025-09-07T07:51:36.1458388Z * [new branch] gh/guangyey/170/orig -> origin/gh/guangyey/170/orig 2025-09-07T07:51:36.1458779Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base 2025-09-07T07:51:36.1459163Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head 2025-09-07T07:51:36.1459530Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig 2025-09-07T07:51:36.1460713Z * [new branch] gh/guangyey/174/base -> origin/gh/guangyey/174/base 2025-09-07T07:51:36.1462280Z * [new branch] gh/guangyey/174/head -> origin/gh/guangyey/174/head 2025-09-07T07:51:36.1463820Z * [new branch] gh/guangyey/174/orig -> origin/gh/guangyey/174/orig 2025-09-07T07:51:36.1466693Z * [new branch] gh/guangyey/176/base -> origin/gh/guangyey/176/base 2025-09-07T07:51:36.1468217Z * [new branch] gh/guangyey/176/head -> origin/gh/guangyey/176/head 2025-09-07T07:51:36.1469807Z * [new branch] gh/guangyey/176/orig -> origin/gh/guangyey/176/orig 2025-09-07T07:51:36.1472066Z * [new branch] 
gh/guangyey/178/base -> origin/gh/guangyey/178/base 2025-09-07T07:51:36.1473619Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head 2025-09-07T07:51:36.1475351Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig 2025-09-07T07:51:36.1477744Z * [new branch] gh/guangyey/181/base -> origin/gh/guangyey/181/base 2025-09-07T07:51:36.1479299Z * [new branch] gh/guangyey/181/head -> origin/gh/guangyey/181/head 2025-09-07T07:51:36.1480854Z * [new branch] gh/guangyey/181/orig -> origin/gh/guangyey/181/orig 2025-09-07T07:51:36.1483096Z * [new branch] gh/guangyey/182/base -> origin/gh/guangyey/182/base 2025-09-07T07:51:36.1484660Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head 2025-09-07T07:51:36.1486604Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig 2025-09-07T07:51:36.1488853Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base 2025-09-07T07:51:36.1490386Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head 2025-09-07T07:51:36.1492584Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig 2025-09-07T07:51:36.1494854Z * [new branch] gh/guangyey/184/base -> origin/gh/guangyey/184/base 2025-09-07T07:51:36.1496903Z * [new branch] gh/guangyey/184/head -> origin/gh/guangyey/184/head 2025-09-07T07:51:36.1498371Z * [new branch] gh/guangyey/184/orig -> origin/gh/guangyey/184/orig 2025-09-07T07:51:36.1500593Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base 2025-09-07T07:51:36.1502312Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head 2025-09-07T07:51:36.1503899Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig 2025-09-07T07:51:36.1506625Z * [new branch] gh/guangyey/186/base -> origin/gh/guangyey/186/base 2025-09-07T07:51:36.1508184Z * [new branch] gh/guangyey/186/head -> origin/gh/guangyey/186/head 2025-09-07T07:51:36.1509698Z * [new branch] gh/guangyey/186/orig -> origin/gh/guangyey/186/orig 
2025-09-07T07:51:36.1511991Z * [new branch] gh/guangyey/187/base -> origin/gh/guangyey/187/base 2025-09-07T07:51:36.1513535Z * [new branch] gh/guangyey/187/head -> origin/gh/guangyey/187/head 2025-09-07T07:51:36.1515268Z * [new branch] gh/guangyey/187/orig -> origin/gh/guangyey/187/orig 2025-09-07T07:51:36.1517820Z * [new branch] gh/guangyey/188/base -> origin/gh/guangyey/188/base 2025-09-07T07:51:36.1519468Z * [new branch] gh/guangyey/188/head -> origin/gh/guangyey/188/head 2025-09-07T07:51:36.1520815Z * [new branch] gh/guangyey/188/orig -> origin/gh/guangyey/188/orig 2025-09-07T07:51:36.1523237Z * [new branch] gh/guangyey/189/base -> origin/gh/guangyey/189/base 2025-09-07T07:51:36.1524803Z * [new branch] gh/guangyey/189/head -> origin/gh/guangyey/189/head 2025-09-07T07:51:36.1526903Z * [new branch] gh/guangyey/189/orig -> origin/gh/guangyey/189/orig 2025-09-07T07:51:36.1529142Z * [new branch] gh/guangyey/190/base -> origin/gh/guangyey/190/base 2025-09-07T07:51:36.1530713Z * [new branch] gh/guangyey/190/head -> origin/gh/guangyey/190/head 2025-09-07T07:51:36.1532285Z * [new branch] gh/guangyey/190/orig -> origin/gh/guangyey/190/orig 2025-09-07T07:51:36.1534519Z * [new branch] gh/guangyey/191/base -> origin/gh/guangyey/191/base 2025-09-07T07:51:36.1536439Z * [new branch] gh/guangyey/191/head -> origin/gh/guangyey/191/head 2025-09-07T07:51:36.1537991Z * [new branch] gh/guangyey/191/orig -> origin/gh/guangyey/191/orig 2025-09-07T07:51:36.1540218Z * [new branch] gh/guangyey/192/base -> origin/gh/guangyey/192/base 2025-09-07T07:51:36.1541900Z * [new branch] gh/guangyey/192/head -> origin/gh/guangyey/192/head 2025-09-07T07:51:36.1543588Z * [new branch] gh/guangyey/192/orig -> origin/gh/guangyey/192/orig 2025-09-07T07:51:36.1546195Z * [new branch] gh/guangyey/193/base -> origin/gh/guangyey/193/base 2025-09-07T07:51:36.1547782Z * [new branch] gh/guangyey/193/head -> origin/gh/guangyey/193/head 2025-09-07T07:51:36.1549383Z * [new branch] gh/guangyey/193/orig -> 
origin/gh/guangyey/193/orig 2025-09-07T07:51:36.1551685Z * [new branch] gh/guangyey/194/base -> origin/gh/guangyey/194/base 2025-09-07T07:51:36.1553334Z * [new branch] gh/guangyey/194/head -> origin/gh/guangyey/194/head 2025-09-07T07:51:36.1554911Z * [new branch] gh/guangyey/194/orig -> origin/gh/guangyey/194/orig 2025-09-07T07:51:36.1557502Z * [new branch] gh/guangyey/195/base -> origin/gh/guangyey/195/base 2025-09-07T07:51:36.1559152Z * [new branch] gh/guangyey/195/head -> origin/gh/guangyey/195/head 2025-09-07T07:51:36.1560737Z * [new branch] gh/guangyey/195/orig -> origin/gh/guangyey/195/orig 2025-09-07T07:51:36.1563230Z * [new branch] gh/guangyey/196/base -> origin/gh/guangyey/196/base 2025-09-07T07:51:36.1564786Z * [new branch] gh/guangyey/196/head -> origin/gh/guangyey/196/head 2025-09-07T07:51:36.1566751Z * [new branch] gh/guangyey/196/orig -> origin/gh/guangyey/196/orig 2025-09-07T07:51:36.1569255Z * [new branch] gh/guangyey/197/base -> origin/gh/guangyey/197/base 2025-09-07T07:51:36.1570609Z * [new branch] gh/guangyey/197/head -> origin/gh/guangyey/197/head 2025-09-07T07:51:36.1572227Z * [new branch] gh/guangyey/197/orig -> origin/gh/guangyey/197/orig 2025-09-07T07:51:36.1574549Z * [new branch] gh/guangyey/198/base -> origin/gh/guangyey/198/base 2025-09-07T07:51:36.1576378Z * [new branch] gh/guangyey/198/head -> origin/gh/guangyey/198/head 2025-09-07T07:51:36.1577964Z * [new branch] gh/guangyey/198/orig -> origin/gh/guangyey/198/orig 2025-09-07T07:51:36.1580546Z * [new branch] gh/guangyey/199/base -> origin/gh/guangyey/199/base 2025-09-07T07:51:36.1582034Z * [new branch] gh/guangyey/199/head -> origin/gh/guangyey/199/head 2025-09-07T07:51:36.1583594Z * [new branch] gh/guangyey/199/orig -> origin/gh/guangyey/199/orig 2025-09-07T07:51:36.1586392Z * [new branch] gh/guangyey/200/base -> origin/gh/guangyey/200/base 2025-09-07T07:51:36.1587802Z * [new branch] gh/guangyey/200/head -> origin/gh/guangyey/200/head 2025-09-07T07:51:36.1589345Z * [new branch] 
gh/guangyey/200/orig -> origin/gh/guangyey/200/orig 2025-09-07T07:51:36.1591686Z * [new branch] gh/guangyey/201/base -> origin/gh/guangyey/201/base 2025-09-07T07:51:36.1593595Z * [new branch] gh/guangyey/201/head -> origin/gh/guangyey/201/head 2025-09-07T07:51:36.1595162Z * [new branch] gh/guangyey/201/orig -> origin/gh/guangyey/201/orig 2025-09-07T07:51:36.1597554Z * [new branch] gh/guangyey/202/base -> origin/gh/guangyey/202/base 2025-09-07T07:51:36.1599093Z * [new branch] gh/guangyey/202/head -> origin/gh/guangyey/202/head 2025-09-07T07:51:36.1600743Z * [new branch] gh/guangyey/202/orig -> origin/gh/guangyey/202/orig 2025-09-07T07:51:36.1603008Z * [new branch] gh/guangyey/203/base -> origin/gh/guangyey/203/base 2025-09-07T07:51:36.1604554Z * [new branch] gh/guangyey/203/head -> origin/gh/guangyey/203/head 2025-09-07T07:51:36.1606535Z * [new branch] gh/guangyey/203/orig -> origin/gh/guangyey/203/orig 2025-09-07T07:51:36.1608812Z * [new branch] gh/guangyey/204/base -> origin/gh/guangyey/204/base 2025-09-07T07:51:36.1610365Z * [new branch] gh/guangyey/204/head -> origin/gh/guangyey/204/head 2025-09-07T07:51:36.1611874Z * [new branch] gh/guangyey/204/orig -> origin/gh/guangyey/204/orig 2025-09-07T07:51:36.1614175Z * [new branch] gh/guangyey/205/base -> origin/gh/guangyey/205/base 2025-09-07T07:51:36.1615992Z * [new branch] gh/guangyey/205/head -> origin/gh/guangyey/205/head 2025-09-07T07:51:36.1617587Z * [new branch] gh/guangyey/205/orig -> origin/gh/guangyey/205/orig 2025-09-07T07:51:36.1619976Z * [new branch] gh/guangyey/206/base -> origin/gh/guangyey/206/base 2025-09-07T07:51:36.1621633Z * [new branch] gh/guangyey/206/head -> origin/gh/guangyey/206/head 2025-09-07T07:51:36.1623289Z * [new branch] gh/guangyey/206/orig -> origin/gh/guangyey/206/orig 2025-09-07T07:51:36.1625845Z * [new branch] gh/guangyey/207/base -> origin/gh/guangyey/207/base 2025-09-07T07:51:36.1627434Z * [new branch] gh/guangyey/207/head -> origin/gh/guangyey/207/head 
2025-09-07T07:51:36.1629076Z * [new branch] gh/guangyey/207/orig -> origin/gh/guangyey/207/orig 2025-09-07T07:51:36.1631366Z * [new branch] gh/guangyey/79/base -> origin/gh/guangyey/79/base 2025-09-07T07:51:36.1632870Z * [new branch] gh/guangyey/79/head -> origin/gh/guangyey/79/head 2025-09-07T07:51:36.1634425Z * [new branch] gh/guangyey/79/orig -> origin/gh/guangyey/79/orig 2025-09-07T07:51:36.1637121Z * [new branch] gh/guangyey/89/base -> origin/gh/guangyey/89/base 2025-09-07T07:51:36.1638757Z * [new branch] gh/guangyey/89/head -> origin/gh/guangyey/89/head 2025-09-07T07:51:36.1640524Z * [new branch] gh/guangyey/89/orig -> origin/gh/guangyey/89/orig 2025-09-07T07:51:36.1643177Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-09-07T07:51:36.1644753Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-09-07T07:51:36.1646733Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-09-07T07:51:36.1648854Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-09-07T07:51:36.1650484Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-09-07T07:51:36.1652249Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-09-07T07:51:36.1654373Z * [new branch] gh/guilhermeleobas/124/base -> origin/gh/guilhermeleobas/124/base 2025-09-07T07:51:36.1656291Z * [new branch] gh/guilhermeleobas/124/head -> origin/gh/guilhermeleobas/124/head 2025-09-07T07:51:36.1658114Z * [new branch] gh/guilhermeleobas/124/orig -> origin/gh/guilhermeleobas/124/orig 2025-09-07T07:51:36.1660350Z * [new branch] gh/guilhermeleobas/147/base -> origin/gh/guilhermeleobas/147/base 2025-09-07T07:51:36.1662050Z * [new branch] gh/guilhermeleobas/147/head -> origin/gh/guilhermeleobas/147/head 2025-09-07T07:51:36.1663726Z * [new branch] gh/guilhermeleobas/147/orig -> origin/gh/guilhermeleobas/147/orig 
2025-09-07T07:51:36.1666260Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-09-07T07:51:36.1667861Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-09-07T07:51:36.1669668Z * [new branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-09-07T07:51:36.1671738Z * [new branch] gh/guilhermeleobas/163/base -> origin/gh/guilhermeleobas/163/base 2025-09-07T07:51:36.1673287Z * [new branch] gh/guilhermeleobas/163/head -> origin/gh/guilhermeleobas/163/head 2025-09-07T07:51:36.1674866Z * [new branch] gh/guilhermeleobas/163/orig -> origin/gh/guilhermeleobas/163/orig 2025-09-07T07:51:36.1677506Z * [new branch] gh/guilhermeleobas/164/base -> origin/gh/guilhermeleobas/164/base 2025-09-07T07:51:36.1679057Z * [new branch] gh/guilhermeleobas/164/head -> origin/gh/guilhermeleobas/164/head 2025-09-07T07:51:36.1680606Z * [new branch] gh/guilhermeleobas/164/orig -> origin/gh/guilhermeleobas/164/orig 2025-09-07T07:51:36.1682859Z * [new branch] gh/guilhermeleobas/165/base -> origin/gh/guilhermeleobas/165/base 2025-09-07T07:51:36.1684637Z * [new branch] gh/guilhermeleobas/165/head -> origin/gh/guilhermeleobas/165/head 2025-09-07T07:51:36.1686572Z * [new branch] gh/guilhermeleobas/165/orig -> origin/gh/guilhermeleobas/165/orig 2025-09-07T07:51:36.1688837Z * [new branch] gh/guilhermeleobas/166/base -> origin/gh/guilhermeleobas/166/base 2025-09-07T07:51:36.1690448Z * [new branch] gh/guilhermeleobas/166/head -> origin/gh/guilhermeleobas/166/head 2025-09-07T07:51:36.1692027Z * [new branch] gh/guilhermeleobas/166/orig -> origin/gh/guilhermeleobas/166/orig 2025-09-07T07:51:36.1694238Z * [new branch] gh/guilhermeleobas/167/base -> origin/gh/guilhermeleobas/167/base 2025-09-07T07:51:36.1696224Z * [new branch] gh/guilhermeleobas/167/head -> origin/gh/guilhermeleobas/167/head 2025-09-07T07:51:36.1697785Z * [new branch] gh/guilhermeleobas/167/orig -> origin/gh/guilhermeleobas/167/orig 
2025-09-07T07:51:36.1700090Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-09-07T07:51:36.1701810Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-09-07T07:51:36.1703342Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-09-07T07:51:36.1705930Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 2025-09-07T07:51:36.1707578Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-09-07T07:51:36.1709178Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-09-07T07:51:36.1711461Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-09-07T07:51:36.1712967Z * [new branch] gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-09-07T07:51:36.1714739Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-09-07T07:51:36.1717268Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-09-07T07:51:36.1718847Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-09-07T07:51:36.1720312Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-09-07T07:51:36.1722535Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-09-07T07:51:36.1724075Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-09-07T07:51:36.1726122Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-09-07T07:51:36.1728369Z * [new branch] gh/guilhermeleobas/192/base -> origin/gh/guilhermeleobas/192/base 2025-09-07T07:51:36.1730059Z * [new branch] gh/guilhermeleobas/192/head -> origin/gh/guilhermeleobas/192/head 2025-09-07T07:51:36.1731595Z * [new branch] gh/guilhermeleobas/192/orig -> origin/gh/guilhermeleobas/192/orig 
2025-09-07T07:51:36.1733896Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-09-07T07:51:36.1735749Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-09-07T07:51:36.1737376Z * [new branch] gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-09-07T07:51:36.1739600Z * [new branch] gh/guilhermeleobas/194/base -> origin/gh/guilhermeleobas/194/base 2025-09-07T07:51:36.1741217Z * [new branch] gh/guilhermeleobas/194/head -> origin/gh/guilhermeleobas/194/head 2025-09-07T07:51:36.1742920Z * [new branch] gh/guilhermeleobas/194/orig -> origin/gh/guilhermeleobas/194/orig 2025-09-07T07:51:36.1745331Z * [new branch] gh/guilhermeleobas/203/base -> origin/gh/guilhermeleobas/203/base 2025-09-07T07:51:36.1747040Z * [new branch] gh/guilhermeleobas/203/head -> origin/gh/guilhermeleobas/203/head 2025-09-07T07:51:36.1748633Z * [new branch] gh/guilhermeleobas/203/orig -> origin/gh/guilhermeleobas/203/orig 2025-09-07T07:51:36.1750838Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-09-07T07:51:36.1752695Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-09-07T07:51:36.1754258Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-09-07T07:51:36.1756845Z * [new branch] gh/guilhermeleobas/205/base -> origin/gh/guilhermeleobas/205/base 2025-09-07T07:51:36.1758318Z * [new branch] gh/guilhermeleobas/205/head -> origin/gh/guilhermeleobas/205/head 2025-09-07T07:51:36.1759881Z * [new branch] gh/guilhermeleobas/205/orig -> origin/gh/guilhermeleobas/205/orig 2025-09-07T07:51:36.1762182Z * [new branch] gh/guilhermeleobas/209/base -> origin/gh/guilhermeleobas/209/base 2025-09-07T07:51:36.1763765Z * [new branch] gh/guilhermeleobas/209/head -> origin/gh/guilhermeleobas/209/head 2025-09-07T07:51:36.1765638Z * [new branch] gh/guilhermeleobas/209/orig -> origin/gh/guilhermeleobas/209/orig 
2025-09-07T07:51:36.1768001Z * [new branch] gh/guilhermeleobas/210/base -> origin/gh/guilhermeleobas/210/base 2025-09-07T07:51:36.1769565Z * [new branch] gh/guilhermeleobas/210/head -> origin/gh/guilhermeleobas/210/head 2025-09-07T07:51:36.1771138Z * [new branch] gh/guilhermeleobas/210/orig -> origin/gh/guilhermeleobas/210/orig 2025-09-07T07:51:36.1773456Z * [new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 2025-09-07T07:51:36.1775184Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-09-07T07:51:36.1777133Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-09-07T07:51:36.1779255Z * [new branch] gh/guilhermeleobas/214/base -> origin/gh/guilhermeleobas/214/base 2025-09-07T07:51:36.1780899Z * [new branch] gh/guilhermeleobas/214/head -> origin/gh/guilhermeleobas/214/head 2025-09-07T07:51:36.1782622Z * [new branch] gh/guilhermeleobas/214/orig -> origin/gh/guilhermeleobas/214/orig 2025-09-07T07:51:36.1784884Z * [new branch] gh/guilhermeleobas/215/base -> origin/gh/guilhermeleobas/215/base 2025-09-07T07:51:36.1786772Z * [new branch] gh/guilhermeleobas/215/head -> origin/gh/guilhermeleobas/215/head 2025-09-07T07:51:36.1788354Z * [new branch] gh/guilhermeleobas/215/orig -> origin/gh/guilhermeleobas/215/orig 2025-09-07T07:51:36.1790969Z * [new branch] gh/guilhermeleobas/216/base -> origin/gh/guilhermeleobas/216/base 2025-09-07T07:51:36.1792566Z * [new branch] gh/guilhermeleobas/216/head -> origin/gh/guilhermeleobas/216/head 2025-09-07T07:51:36.1794087Z * [new branch] gh/guilhermeleobas/216/orig -> origin/gh/guilhermeleobas/216/orig 2025-09-07T07:51:36.1796841Z * [new branch] gh/guilhermeleobas/217/base -> origin/gh/guilhermeleobas/217/base 2025-09-07T07:51:36.1798411Z * [new branch] gh/guilhermeleobas/217/head -> origin/gh/guilhermeleobas/217/head 2025-09-07T07:51:36.1800023Z * [new branch] gh/guilhermeleobas/217/orig -> origin/gh/guilhermeleobas/217/orig 
2025-09-07T07:51:36.1802283Z * [new branch] gh/guilhermeleobas/219/base -> origin/gh/guilhermeleobas/219/base 2025-09-07T07:51:36.1803827Z * [new branch] gh/guilhermeleobas/219/head -> origin/gh/guilhermeleobas/219/head 2025-09-07T07:51:36.1805856Z * [new branch] gh/guilhermeleobas/219/orig -> origin/gh/guilhermeleobas/219/orig 2025-09-07T07:51:36.1808238Z * [new branch] gh/guilhermeleobas/220/base -> origin/gh/guilhermeleobas/220/base 2025-09-07T07:51:36.1809855Z * [new branch] gh/guilhermeleobas/220/head -> origin/gh/guilhermeleobas/220/head 2025-09-07T07:51:36.1811472Z * [new branch] gh/guilhermeleobas/220/orig -> origin/gh/guilhermeleobas/220/orig 2025-09-07T07:51:36.1813741Z * [new branch] gh/guilhermeleobas/221/base -> origin/gh/guilhermeleobas/221/base 2025-09-07T07:51:36.1815508Z * [new branch] gh/guilhermeleobas/221/head -> origin/gh/guilhermeleobas/221/head 2025-09-07T07:51:36.1817184Z * [new branch] gh/guilhermeleobas/221/orig -> origin/gh/guilhermeleobas/221/orig 2025-09-07T07:51:36.1819459Z * [new branch] gh/guilhermeleobas/222/base -> origin/gh/guilhermeleobas/222/base 2025-09-07T07:51:36.1821124Z * [new branch] gh/guilhermeleobas/222/head -> origin/gh/guilhermeleobas/222/head 2025-09-07T07:51:36.1822836Z * [new branch] gh/guilhermeleobas/222/orig -> origin/gh/guilhermeleobas/222/orig 2025-09-07T07:51:36.1825255Z * [new branch] gh/guilhermeleobas/223/base -> origin/gh/guilhermeleobas/223/base 2025-09-07T07:51:36.1827021Z * [new branch] gh/guilhermeleobas/223/head -> origin/gh/guilhermeleobas/223/head 2025-09-07T07:51:36.1828635Z * [new branch] gh/guilhermeleobas/223/orig -> origin/gh/guilhermeleobas/223/orig 2025-09-07T07:51:36.1830983Z * [new branch] gh/guilhermeleobas/224/base -> origin/gh/guilhermeleobas/224/base 2025-09-07T07:51:36.1832585Z * [new branch] gh/guilhermeleobas/224/head -> origin/gh/guilhermeleobas/224/head 2025-09-07T07:51:36.1834133Z * [new branch] gh/guilhermeleobas/224/orig -> origin/gh/guilhermeleobas/224/orig 
2025-09-07T07:51:36.1836737Z * [new branch] gh/guilhermeleobas/225/base -> origin/gh/guilhermeleobas/225/base 2025-09-07T07:51:36.1838301Z * [new branch] gh/guilhermeleobas/225/head -> origin/gh/guilhermeleobas/225/head 2025-09-07T07:51:36.1839980Z * [new branch] gh/guilhermeleobas/225/orig -> origin/gh/guilhermeleobas/225/orig 2025-09-07T07:51:36.1842053Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 2025-09-07T07:51:36.1843729Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-09-07T07:51:36.1845326Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-09-07T07:51:36.1848007Z * [new branch] gh/guilhermeleobas/227/base -> origin/gh/guilhermeleobas/227/base 2025-09-07T07:51:36.1849739Z * [new branch] gh/guilhermeleobas/227/head -> origin/gh/guilhermeleobas/227/head 2025-09-07T07:51:36.1851317Z * [new branch] gh/guilhermeleobas/227/orig -> origin/gh/guilhermeleobas/227/orig 2025-09-07T07:51:36.1853549Z * [new branch] gh/guilhermeleobas/228/base -> origin/gh/guilhermeleobas/228/base 2025-09-07T07:51:36.1855358Z * [new branch] gh/guilhermeleobas/228/head -> origin/gh/guilhermeleobas/228/head 2025-09-07T07:51:36.1856968Z * [new branch] gh/guilhermeleobas/228/orig -> origin/gh/guilhermeleobas/228/orig 2025-09-07T07:51:36.1859243Z * [new branch] gh/guilhermeleobas/229/base -> origin/gh/guilhermeleobas/229/base 2025-09-07T07:51:36.1860904Z * [new branch] gh/guilhermeleobas/229/head -> origin/gh/guilhermeleobas/229/head 2025-09-07T07:51:36.1862692Z * [new branch] gh/guilhermeleobas/229/orig -> origin/gh/guilhermeleobas/229/orig 2025-09-07T07:51:36.1865108Z * [new branch] gh/guilhermeleobas/230/base -> origin/gh/guilhermeleobas/230/base 2025-09-07T07:51:36.1866924Z * [new branch] gh/guilhermeleobas/230/head -> origin/gh/guilhermeleobas/230/head 2025-09-07T07:51:36.1868557Z * [new branch] gh/guilhermeleobas/230/orig -> origin/gh/guilhermeleobas/230/orig 
2025-09-07T07:51:36.1870859Z * [new branch] gh/guilhermeleobas/231/base -> origin/gh/guilhermeleobas/231/base 2025-09-07T07:51:36.1872486Z * [new branch] gh/guilhermeleobas/231/head -> origin/gh/guilhermeleobas/231/head 2025-09-07T07:51:36.1874148Z * [new branch] gh/guilhermeleobas/231/orig -> origin/gh/guilhermeleobas/231/orig 2025-09-07T07:51:36.1876705Z * [new branch] gh/guilhermeleobas/232/base -> origin/gh/guilhermeleobas/232/base 2025-09-07T07:51:36.1878294Z * [new branch] gh/guilhermeleobas/232/head -> origin/gh/guilhermeleobas/232/head 2025-09-07T07:51:36.1879846Z * [new branch] gh/guilhermeleobas/232/orig -> origin/gh/guilhermeleobas/232/orig 2025-09-07T07:51:36.1882080Z * [new branch] gh/guilhermeleobas/233/base -> origin/gh/guilhermeleobas/233/base 2025-09-07T07:51:36.1883594Z * [new branch] gh/guilhermeleobas/233/head -> origin/gh/guilhermeleobas/233/head 2025-09-07T07:51:36.1885318Z * [new branch] gh/guilhermeleobas/233/orig -> origin/gh/guilhermeleobas/233/orig 2025-09-07T07:51:36.1887821Z * [new branch] gh/guilhermeleobas/234/base -> origin/gh/guilhermeleobas/234/base 2025-09-07T07:51:36.1889488Z * [new branch] gh/guilhermeleobas/234/head -> origin/gh/guilhermeleobas/234/head 2025-09-07T07:51:36.1891064Z * [new branch] gh/guilhermeleobas/234/orig -> origin/gh/guilhermeleobas/234/orig 2025-09-07T07:51:36.1893282Z * [new branch] gh/guilhermeleobas/235/base -> origin/gh/guilhermeleobas/235/base 2025-09-07T07:51:36.1894863Z * [new branch] gh/guilhermeleobas/235/head -> origin/gh/guilhermeleobas/235/head 2025-09-07T07:51:36.1897177Z * [new branch] gh/guilhermeleobas/235/orig -> origin/gh/guilhermeleobas/235/orig 2025-09-07T07:51:36.1899431Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-09-07T07:51:36.1901062Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-09-07T07:51:36.1902973Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 
2025-09-07T07:51:36.1905270Z * [new branch] gh/guilhermeleobas/237/base -> origin/gh/guilhermeleobas/237/base 2025-09-07T07:51:36.1907041Z * [new branch] gh/guilhermeleobas/237/head -> origin/gh/guilhermeleobas/237/head 2025-09-07T07:51:36.1908562Z * [new branch] gh/guilhermeleobas/237/orig -> origin/gh/guilhermeleobas/237/orig 2025-09-07T07:51:36.1910897Z * [new branch] gh/guilhermeleobas/238/base -> origin/gh/guilhermeleobas/238/base 2025-09-07T07:51:36.1912544Z * [new branch] gh/guilhermeleobas/238/head -> origin/gh/guilhermeleobas/238/head 2025-09-07T07:51:36.1914055Z * [new branch] gh/guilhermeleobas/238/orig -> origin/gh/guilhermeleobas/238/orig 2025-09-07T07:51:36.1916706Z * [new branch] gh/guilhermeleobas/239/base -> origin/gh/guilhermeleobas/239/base 2025-09-07T07:51:36.1918296Z * [new branch] gh/guilhermeleobas/239/head -> origin/gh/guilhermeleobas/239/head 2025-09-07T07:51:36.1919857Z * [new branch] gh/guilhermeleobas/239/orig -> origin/gh/guilhermeleobas/239/orig 2025-09-07T07:51:36.1922192Z * [new branch] gh/guilhermeleobas/240/base -> origin/gh/guilhermeleobas/240/base 2025-09-07T07:51:36.1923811Z * [new branch] gh/guilhermeleobas/240/head -> origin/gh/guilhermeleobas/240/head 2025-09-07T07:51:36.1925531Z * [new branch] gh/guilhermeleobas/240/orig -> origin/gh/guilhermeleobas/240/orig 2025-09-07T07:51:36.1927996Z * [new branch] gh/guilhermeleobas/241/base -> origin/gh/guilhermeleobas/241/base 2025-09-07T07:51:36.1929676Z * [new branch] gh/guilhermeleobas/241/head -> origin/gh/guilhermeleobas/241/head 2025-09-07T07:51:36.1931219Z * [new branch] gh/guilhermeleobas/241/orig -> origin/gh/guilhermeleobas/241/orig 2025-09-07T07:51:36.1933545Z * [new branch] gh/guilhermeleobas/242/base -> origin/gh/guilhermeleobas/242/base 2025-09-07T07:51:36.1935345Z * [new branch] gh/guilhermeleobas/242/head -> origin/gh/guilhermeleobas/242/head 2025-09-07T07:51:36.1937043Z * [new branch] gh/guilhermeleobas/242/orig -> origin/gh/guilhermeleobas/242/orig 
2025-09-07T07:51:36.1939245Z * [new branch] gh/guilhermeleobas/243/base -> origin/gh/guilhermeleobas/243/base 2025-09-07T07:51:36.1940896Z * [new branch] gh/guilhermeleobas/243/head -> origin/gh/guilhermeleobas/243/head 2025-09-07T07:51:36.1942742Z * [new branch] gh/guilhermeleobas/243/orig -> origin/gh/guilhermeleobas/243/orig 2025-09-07T07:51:36.1945237Z * [new branch] gh/guilhermeleobas/244/base -> origin/gh/guilhermeleobas/244/base 2025-09-07T07:51:36.1947065Z * [new branch] gh/guilhermeleobas/244/head -> origin/gh/guilhermeleobas/244/head 2025-09-07T07:51:36.1948644Z * [new branch] gh/guilhermeleobas/244/orig -> origin/gh/guilhermeleobas/244/orig 2025-09-07T07:51:36.1950926Z * [new branch] gh/guilhermeleobas/245/base -> origin/gh/guilhermeleobas/245/base 2025-09-07T07:51:36.1952512Z * [new branch] gh/guilhermeleobas/245/head -> origin/gh/guilhermeleobas/245/head 2025-09-07T07:51:36.1954101Z * [new branch] gh/guilhermeleobas/245/orig -> origin/gh/guilhermeleobas/245/orig 2025-09-07T07:51:36.1956881Z * [new branch] gh/guilhermeleobas/73/base -> origin/gh/guilhermeleobas/73/base 2025-09-07T07:51:36.1958394Z * [new branch] gh/guilhermeleobas/73/head -> origin/gh/guilhermeleobas/73/head 2025-09-07T07:51:36.1959918Z * [new branch] gh/guilhermeleobas/73/orig -> origin/gh/guilhermeleobas/73/orig 2025-09-07T07:51:36.1962679Z * [new branch] gh/henrylhtsang/140/base -> origin/gh/henrylhtsang/140/base 2025-09-07T07:51:36.1964495Z * [new branch] gh/henrylhtsang/140/head -> origin/gh/henrylhtsang/140/head 2025-09-07T07:51:36.1966574Z * [new branch] gh/henrylhtsang/140/orig -> origin/gh/henrylhtsang/140/orig 2025-09-07T07:51:36.1968582Z * [new branch] gh/henrylhtsang/141/base -> origin/gh/henrylhtsang/141/base 2025-09-07T07:51:36.1970210Z * [new branch] gh/henrylhtsang/141/head -> origin/gh/henrylhtsang/141/head 2025-09-07T07:51:36.1971804Z * [new branch] gh/henrylhtsang/141/orig -> origin/gh/henrylhtsang/141/orig 2025-09-07T07:51:36.1974184Z * [new branch] 
gh/henrylhtsang/142/base -> origin/gh/henrylhtsang/142/base 2025-09-07T07:51:36.1976223Z * [new branch] gh/henrylhtsang/142/head -> origin/gh/henrylhtsang/142/head 2025-09-07T07:51:36.1977848Z * [new branch] gh/henrylhtsang/142/orig -> origin/gh/henrylhtsang/142/orig 2025-09-07T07:51:36.1980186Z * [new branch] gh/henrylhtsang/143/base -> origin/gh/henrylhtsang/143/base 2025-09-07T07:51:36.1981833Z * [new branch] gh/henrylhtsang/143/head -> origin/gh/henrylhtsang/143/head 2025-09-07T07:51:36.1983539Z * [new branch] gh/henrylhtsang/143/orig -> origin/gh/henrylhtsang/143/orig 2025-09-07T07:51:36.1986119Z * [new branch] gh/henrylhtsang/144/base -> origin/gh/henrylhtsang/144/base 2025-09-07T07:51:36.1987720Z * [new branch] gh/henrylhtsang/144/head -> origin/gh/henrylhtsang/144/head 2025-09-07T07:51:36.1989293Z * [new branch] gh/henrylhtsang/144/orig -> origin/gh/henrylhtsang/144/orig 2025-09-07T07:51:36.1991543Z * [new branch] gh/henrylhtsang/145/base -> origin/gh/henrylhtsang/145/base 2025-09-07T07:51:36.1993167Z * [new branch] gh/henrylhtsang/145/head -> origin/gh/henrylhtsang/145/head 2025-09-07T07:51:36.1994744Z * [new branch] gh/henrylhtsang/145/orig -> origin/gh/henrylhtsang/145/orig 2025-09-07T07:51:36.1997322Z * [new branch] gh/henrylhtsang/146/base -> origin/gh/henrylhtsang/146/base 2025-09-07T07:51:36.1999064Z * [new branch] gh/henrylhtsang/146/head -> origin/gh/henrylhtsang/146/head 2025-09-07T07:51:36.2000520Z * [new branch] gh/henrylhtsang/146/orig -> origin/gh/henrylhtsang/146/orig 2025-09-07T07:51:36.2002881Z * [new branch] gh/henrylhtsang/147/base -> origin/gh/henrylhtsang/147/base 2025-09-07T07:51:36.2004578Z * [new branch] gh/henrylhtsang/147/head -> origin/gh/henrylhtsang/147/head 2025-09-07T07:51:36.2006429Z * [new branch] gh/henrylhtsang/147/orig -> origin/gh/henrylhtsang/147/orig 2025-09-07T07:51:36.2008890Z * [new branch] gh/henrylhtsang/148/base -> origin/gh/henrylhtsang/148/base 2025-09-07T07:51:36.2010591Z * [new branch] 
gh/henrylhtsang/148/head -> origin/gh/henrylhtsang/148/head 2025-09-07T07:51:36.2012186Z * [new branch] gh/henrylhtsang/148/orig -> origin/gh/henrylhtsang/148/orig 2025-09-07T07:51:36.2014455Z * [new branch] gh/henrylhtsang/149/base -> origin/gh/henrylhtsang/149/base 2025-09-07T07:51:36.2016391Z * [new branch] gh/henrylhtsang/149/head -> origin/gh/henrylhtsang/149/head 2025-09-07T07:51:36.2017932Z * [new branch] gh/henrylhtsang/149/orig -> origin/gh/henrylhtsang/149/orig 2025-09-07T07:51:36.2020751Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-09-07T07:51:36.2023075Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-09-07T07:51:36.2025520Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-09-07T07:51:36.2027882Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-09-07T07:51:36.2030115Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-09-07T07:51:36.2032337Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-09-07T07:51:36.2035274Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-09-07T07:51:36.2037315Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-09-07T07:51:36.2039843Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-09-07T07:51:36.2041335Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-09-07T07:51:36.2043817Z * [new branch] gh/isuruf/141/base -> origin/gh/isuruf/141/base 2025-09-07T07:51:36.2045644Z * [new branch] gh/isuruf/141/head -> origin/gh/isuruf/141/head 2025-09-07T07:51:36.2047246Z * [new branch] gh/isuruf/141/orig -> origin/gh/isuruf/141/orig 2025-09-07T07:51:36.2049511Z * [new branch] gh/isuruf/142/base -> origin/gh/isuruf/142/base 2025-09-07T07:51:36.2051084Z * [new branch] gh/isuruf/142/head -> origin/gh/isuruf/142/head 2025-09-07T07:51:36.2052639Z * [new branch] gh/isuruf/142/orig -> origin/gh/isuruf/142/orig 2025-09-07T07:51:36.2054929Z * [new branch] gh/isuruf/143/base -> 
origin/gh/isuruf/143/base 2025-09-07T07:51:36.2056765Z * [new branch] gh/isuruf/143/head -> origin/gh/isuruf/143/head 2025-09-07T07:51:36.2058340Z * [new branch] gh/isuruf/143/orig -> origin/gh/isuruf/143/orig 2025-09-07T07:51:36.2060583Z * [new branch] gh/isuruf/144/base -> origin/gh/isuruf/144/base 2025-09-07T07:51:36.2062238Z * [new branch] gh/isuruf/144/head -> origin/gh/isuruf/144/head 2025-09-07T07:51:36.2063793Z * [new branch] gh/isuruf/144/orig -> origin/gh/isuruf/144/orig 2025-09-07T07:51:36.2066345Z * [new branch] gh/isuruf/145/base -> origin/gh/isuruf/145/base 2025-09-07T07:51:36.2067861Z * [new branch] gh/isuruf/145/head -> origin/gh/isuruf/145/head 2025-09-07T07:51:36.2069417Z * [new branch] gh/isuruf/145/orig -> origin/gh/isuruf/145/orig 2025-09-07T07:51:36.2071613Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-09-07T07:51:36.2073321Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-09-07T07:51:36.2074912Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-09-07T07:51:36.2077498Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-09-07T07:51:36.2079041Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-09-07T07:51:36.2080552Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-09-07T07:51:36.2083314Z * [new branch] gh/jamesjwu/150/base -> origin/gh/jamesjwu/150/base 2025-09-07T07:51:36.2085052Z * [new branch] gh/jamesjwu/150/head -> origin/gh/jamesjwu/150/head 2025-09-07T07:51:36.2086935Z * [new branch] gh/jamesjwu/150/orig -> origin/gh/jamesjwu/150/orig 2025-09-07T07:51:36.2089309Z * [new branch] gh/jamesjwu/154/base -> origin/gh/jamesjwu/154/base 2025-09-07T07:51:36.2090888Z * [new branch] gh/jamesjwu/154/head -> origin/gh/jamesjwu/154/head 2025-09-07T07:51:36.2092433Z * [new branch] gh/jamesjwu/154/orig -> origin/gh/jamesjwu/154/orig 2025-09-07T07:51:36.2094681Z * [new branch] gh/jamesjwu/155/base -> origin/gh/jamesjwu/155/base 
2025-09-07T07:51:36.2096608Z * [new branch] gh/jamesjwu/155/head -> origin/gh/jamesjwu/155/head 2025-09-07T07:51:36.2098202Z * [new branch] gh/jamesjwu/155/orig -> origin/gh/jamesjwu/155/orig 2025-09-07T07:51:36.2100428Z * [new branch] gh/jamesjwu/159/base -> origin/gh/jamesjwu/159/base 2025-09-07T07:51:36.2102169Z * [new branch] gh/jamesjwu/159/head -> origin/gh/jamesjwu/159/head 2025-09-07T07:51:36.2103909Z * [new branch] gh/jamesjwu/159/orig -> origin/gh/jamesjwu/159/orig 2025-09-07T07:51:36.2106689Z * [new branch] gh/jamesjwu/163/base -> origin/gh/jamesjwu/163/base 2025-09-07T07:51:36.2108258Z * [new branch] gh/jamesjwu/163/head -> origin/gh/jamesjwu/163/head 2025-09-07T07:51:36.2109793Z * [new branch] gh/jamesjwu/163/orig -> origin/gh/jamesjwu/163/orig 2025-09-07T07:51:36.2112112Z * [new branch] gh/jamesjwu/171/base -> origin/gh/jamesjwu/171/base 2025-09-07T07:51:36.2113626Z * [new branch] gh/jamesjwu/171/head -> origin/gh/jamesjwu/171/head 2025-09-07T07:51:36.2115307Z * [new branch] gh/jamesjwu/171/orig -> origin/gh/jamesjwu/171/orig 2025-09-07T07:51:36.2117738Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-09-07T07:51:36.2119311Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-09-07T07:51:36.2121244Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-09-07T07:51:36.2123136Z * [new branch] gh/jamesjwu/181/base -> origin/gh/jamesjwu/181/base 2025-09-07T07:51:36.2124710Z * [new branch] gh/jamesjwu/181/head -> origin/gh/jamesjwu/181/head 2025-09-07T07:51:36.2126624Z * [new branch] gh/jamesjwu/181/orig -> origin/gh/jamesjwu/181/orig 2025-09-07T07:51:36.2128905Z * [new branch] gh/jamesjwu/182/base -> origin/gh/jamesjwu/182/base 2025-09-07T07:51:36.2130445Z * [new branch] gh/jamesjwu/182/head -> origin/gh/jamesjwu/182/head 2025-09-07T07:51:36.2131956Z * [new branch] gh/jamesjwu/182/orig -> origin/gh/jamesjwu/182/orig 2025-09-07T07:51:36.2134150Z * [new branch] gh/jamesjwu/183/base -> 
origin/gh/jamesjwu/183/base 2025-09-07T07:51:36.2136097Z * [new branch] gh/jamesjwu/183/head -> origin/gh/jamesjwu/183/head 2025-09-07T07:51:36.2137852Z * [new branch] gh/jamesjwu/183/orig -> origin/gh/jamesjwu/183/orig 2025-09-07T07:51:36.2140192Z * [new branch] gh/jamesjwu/184/base -> origin/gh/jamesjwu/184/base 2025-09-07T07:51:36.2141794Z * [new branch] gh/jamesjwu/184/head -> origin/gh/jamesjwu/184/head 2025-09-07T07:51:36.2143509Z * [new branch] gh/jamesjwu/184/orig -> origin/gh/jamesjwu/184/orig 2025-09-07T07:51:36.2146047Z * [new branch] gh/jamesjwu/185/base -> origin/gh/jamesjwu/185/base 2025-09-07T07:51:36.2147687Z * [new branch] gh/jamesjwu/185/head -> origin/gh/jamesjwu/185/head 2025-09-07T07:51:36.2149268Z * [new branch] gh/jamesjwu/185/orig -> origin/gh/jamesjwu/185/orig 2025-09-07T07:51:36.2151516Z * [new branch] gh/jamesjwu/186/base -> origin/gh/jamesjwu/186/base 2025-09-07T07:51:36.2153085Z * [new branch] gh/jamesjwu/186/head -> origin/gh/jamesjwu/186/head 2025-09-07T07:51:36.2154701Z * [new branch] gh/jamesjwu/186/orig -> origin/gh/jamesjwu/186/orig 2025-09-07T07:51:36.2157279Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-09-07T07:51:36.2158768Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-09-07T07:51:36.2160213Z * [new branch] gh/jamesjwu/187/orig -> origin/gh/jamesjwu/187/orig 2025-09-07T07:51:36.2162510Z * [new branch] gh/jamesjwu/188/base -> origin/gh/jamesjwu/188/base 2025-09-07T07:51:36.2164105Z * [new branch] gh/jamesjwu/188/head -> origin/gh/jamesjwu/188/head 2025-09-07T07:51:36.2165970Z * [new branch] gh/jamesjwu/188/orig -> origin/gh/jamesjwu/188/orig 2025-09-07T07:51:36.2168339Z * [new branch] gh/jamesjwu/189/base -> origin/gh/jamesjwu/189/base 2025-09-07T07:51:36.2170174Z * [new branch] gh/jamesjwu/189/head -> origin/gh/jamesjwu/189/head 2025-09-07T07:51:36.2171523Z * [new branch] gh/jamesjwu/189/orig -> origin/gh/jamesjwu/189/orig 2025-09-07T07:51:36.2173750Z * [new branch] 
gh/jamesjwu/190/base -> origin/gh/jamesjwu/190/base 2025-09-07T07:51:36.2175494Z * [new branch] gh/jamesjwu/190/head -> origin/gh/jamesjwu/190/head 2025-09-07T07:51:36.2177355Z * [new branch] gh/jamesjwu/190/orig -> origin/gh/jamesjwu/190/orig 2025-09-07T07:51:36.2179678Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-09-07T07:51:36.2181360Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-09-07T07:51:36.2183687Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-09-07T07:51:36.2185290Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-09-07T07:51:36.2187739Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-09-07T07:51:36.2189252Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-09-07T07:51:36.2191482Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-09-07T07:51:36.2192976Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-09-07T07:51:36.2195219Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-09-07T07:51:36.2196875Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-09-07T07:51:36.2199071Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-09-07T07:51:36.2200529Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-09-07T07:51:36.2202704Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-09-07T07:51:36.2204347Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-09-07T07:51:36.2206808Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-09-07T07:51:36.2208267Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-09-07T07:51:36.2210407Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-09-07T07:51:36.2212003Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-09-07T07:51:36.2214149Z * [new branch] gh/jamesjwu/61/base 
-> origin/gh/jamesjwu/61/base 2025-09-07T07:51:36.2216069Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-09-07T07:51:36.2218280Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-09-07T07:51:36.2219807Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-09-07T07:51:36.2222167Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-09-07T07:51:36.2223702Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-09-07T07:51:36.2226380Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 2025-09-07T07:51:36.2227932Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-09-07T07:51:36.2230122Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-09-07T07:51:36.2231534Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-09-07T07:51:36.2234559Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-09-07T07:51:36.2236498Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-09-07T07:51:36.2238293Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-09-07T07:51:36.2240305Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-09-07T07:51:36.2241858Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-09-07T07:51:36.2243448Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-09-07T07:51:36.2246242Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-09-07T07:51:36.2247749Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-09-07T07:51:36.2249237Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-09-07T07:51:36.2251465Z * [new branch] gh/janeyx99/296/base -> origin/gh/janeyx99/296/base 2025-09-07T07:51:36.2253041Z * [new branch] gh/janeyx99/296/head -> origin/gh/janeyx99/296/head 2025-09-07T07:51:36.2254649Z * [new branch] gh/janeyx99/296/orig -> 
origin/gh/janeyx99/296/orig 2025-09-07T07:51:36.2257325Z * [new branch] gh/janeyx99/297/base -> origin/gh/janeyx99/297/base 2025-09-07T07:51:36.2258902Z * [new branch] gh/janeyx99/297/head -> origin/gh/janeyx99/297/head 2025-09-07T07:51:36.2260428Z * [new branch] gh/janeyx99/297/orig -> origin/gh/janeyx99/297/orig 2025-09-07T07:51:36.2262804Z * [new branch] gh/janeyx99/298/base -> origin/gh/janeyx99/298/base 2025-09-07T07:51:36.2264335Z * [new branch] gh/janeyx99/298/head -> origin/gh/janeyx99/298/head 2025-09-07T07:51:36.2266386Z * [new branch] gh/janeyx99/298/orig -> origin/gh/janeyx99/298/orig 2025-09-07T07:51:36.2268619Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-09-07T07:51:36.2270184Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-09-07T07:51:36.2271691Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-09-07T07:51:36.2274146Z * [new branch] gh/janeyx99/300/base -> origin/gh/janeyx99/300/base 2025-09-07T07:51:36.2276253Z * [new branch] gh/janeyx99/300/head -> origin/gh/janeyx99/300/head 2025-09-07T07:51:36.2277859Z * [new branch] gh/janeyx99/300/orig -> origin/gh/janeyx99/300/orig 2025-09-07T07:51:36.2280028Z * [new branch] gh/janeyx99/301/base -> origin/gh/janeyx99/301/base 2025-09-07T07:51:36.2281577Z * [new branch] gh/janeyx99/301/head -> origin/gh/janeyx99/301/head 2025-09-07T07:51:36.2283153Z * [new branch] gh/janeyx99/301/orig -> origin/gh/janeyx99/301/orig 2025-09-07T07:51:36.2285512Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-09-07T07:51:36.2287224Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-09-07T07:51:36.2289378Z * [new branch] gh/janeyx99/303/base -> origin/gh/janeyx99/303/base 2025-09-07T07:51:36.2290855Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-09-07T07:51:36.2293182Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-09-07T07:51:36.2294881Z * [new branch] 
gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-09-07T07:51:36.2296769Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-09-07T07:51:36.2299633Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-09-07T07:51:36.2301148Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-09-07T07:51:36.2303489Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-09-07T07:51:36.2305345Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-09-07T07:51:36.2306968Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-09-07T07:51:36.2309126Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-09-07T07:51:36.2310653Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-09-07T07:51:36.2312354Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-09-07T07:51:36.2314531Z * [new branch] gh/jansel/531/base -> origin/gh/jansel/531/base 2025-09-07T07:51:36.2316456Z * [new branch] gh/jansel/531/head -> origin/gh/jansel/531/head 2025-09-07T07:51:36.2317915Z * [new branch] gh/jansel/531/orig -> origin/gh/jansel/531/orig 2025-09-07T07:51:36.2320816Z * [new branch] gh/jbschlosser/208/head -> origin/gh/jbschlosser/208/head 2025-09-07T07:51:36.2323108Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-09-07T07:51:36.2324689Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-09-07T07:51:36.2326862Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-09-07T07:51:36.2329201Z * [new branch] gh/jbschlosser/248/base -> origin/gh/jbschlosser/248/base 2025-09-07T07:51:36.2330991Z * [new branch] gh/jbschlosser/248/head -> origin/gh/jbschlosser/248/head 2025-09-07T07:51:36.2332510Z * [new branch] gh/jbschlosser/248/orig -> origin/gh/jbschlosser/248/orig 2025-09-07T07:51:36.2334800Z * [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 
2025-09-07T07:51:36.2336742Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-09-07T07:51:36.2338245Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-09-07T07:51:36.2341034Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-09-07T07:51:36.2342824Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-09-07T07:51:36.2344372Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-09-07T07:51:36.2346786Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-09-07T07:51:36.2348441Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-09-07T07:51:36.2350057Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-09-07T07:51:36.2352253Z * [new branch] gh/jiayisunx/64/base -> origin/gh/jiayisunx/64/base 2025-09-07T07:51:36.2353787Z * [new branch] gh/jiayisunx/64/head -> origin/gh/jiayisunx/64/head 2025-09-07T07:51:36.2355569Z * [new branch] gh/jiayisunx/64/orig -> origin/gh/jiayisunx/64/orig 2025-09-07T07:51:36.2357826Z * [new branch] gh/jiayisunx/65/base -> origin/gh/jiayisunx/65/base 2025-09-07T07:51:36.2359523Z * [new branch] gh/jiayisunx/65/head -> origin/gh/jiayisunx/65/head 2025-09-07T07:51:36.2361078Z * [new branch] gh/jiayisunx/65/orig -> origin/gh/jiayisunx/65/orig 2025-09-07T07:51:36.2363331Z * [new branch] gh/jiayisunx/66/base -> origin/gh/jiayisunx/66/base 2025-09-07T07:51:36.2364899Z * [new branch] gh/jiayisunx/66/head -> origin/gh/jiayisunx/66/head 2025-09-07T07:51:36.2366958Z * [new branch] gh/jiayisunx/66/orig -> origin/gh/jiayisunx/66/orig 2025-09-07T07:51:36.2369193Z * [new branch] gh/jiayisunx/67/base -> origin/gh/jiayisunx/67/base 2025-09-07T07:51:36.2370814Z * [new branch] gh/jiayisunx/67/head -> origin/gh/jiayisunx/67/head 2025-09-07T07:51:36.2372530Z * [new branch] gh/jiayisunx/67/orig -> origin/gh/jiayisunx/67/orig 2025-09-07T07:51:36.2374615Z * [new branch] gh/jiayisunx/68/base -> 
origin/gh/jiayisunx/68/base 2025-09-07T07:51:36.2376509Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-09-07T07:51:36.2378103Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-09-07T07:51:36.2380343Z * [new branch] gh/jiayisunx/69/base -> origin/gh/jiayisunx/69/base 2025-09-07T07:51:36.2382132Z * [new branch] gh/jiayisunx/69/head -> origin/gh/jiayisunx/69/head 2025-09-07T07:51:36.2383668Z * [new branch] gh/jiayisunx/69/orig -> origin/gh/jiayisunx/69/orig 2025-09-07T07:51:36.2386283Z * [new branch] gh/jiayisunx/70/base -> origin/gh/jiayisunx/70/base 2025-09-07T07:51:36.2387833Z * [new branch] gh/jiayisunx/70/head -> origin/gh/jiayisunx/70/head 2025-09-07T07:51:36.2389411Z * [new branch] gh/jiayisunx/70/orig -> origin/gh/jiayisunx/70/orig 2025-09-07T07:51:36.2391626Z * [new branch] gh/jiayisunx/71/base -> origin/gh/jiayisunx/71/base 2025-09-07T07:51:36.2393181Z * [new branch] gh/jiayisunx/71/head -> origin/gh/jiayisunx/71/head 2025-09-07T07:51:36.2394721Z * [new branch] gh/jiayisunx/71/orig -> origin/gh/jiayisunx/71/orig 2025-09-07T07:51:36.2397433Z * [new branch] gh/jiayisunx/72/base -> origin/gh/jiayisunx/72/base 2025-09-07T07:51:36.2398941Z * [new branch] gh/jiayisunx/72/head -> origin/gh/jiayisunx/72/head 2025-09-07T07:51:36.2400457Z * [new branch] gh/jiayisunx/72/orig -> origin/gh/jiayisunx/72/orig 2025-09-07T07:51:36.2402680Z * [new branch] gh/jiayisunx/73/base -> origin/gh/jiayisunx/73/base 2025-09-07T07:51:36.2404373Z * [new branch] gh/jiayisunx/73/head -> origin/gh/jiayisunx/73/head 2025-09-07T07:51:36.2406493Z * [new branch] gh/jiayisunx/73/orig -> origin/gh/jiayisunx/73/orig 2025-09-07T07:51:36.2408541Z * [new branch] gh/jiayisunx/74/base -> origin/gh/jiayisunx/74/base 2025-09-07T07:51:36.2410084Z * [new branch] gh/jiayisunx/74/head -> origin/gh/jiayisunx/74/head 2025-09-07T07:51:36.2411666Z * [new branch] gh/jiayisunx/74/orig -> origin/gh/jiayisunx/74/orig 2025-09-07T07:51:36.2413829Z * [new branch] 
gh/jiayisunx/75/base -> origin/gh/jiayisunx/75/base 2025-09-07T07:51:36.2415529Z * [new branch] gh/jiayisunx/75/head -> origin/gh/jiayisunx/75/head 2025-09-07T07:51:36.2417545Z * [new branch] gh/jiayisunx/75/orig -> origin/gh/jiayisunx/75/orig 2025-09-07T07:51:36.2419627Z * [new branch] gh/jiayisunx/76/base -> origin/gh/jiayisunx/76/base 2025-09-07T07:51:36.2421368Z * [new branch] gh/jiayisunx/76/head -> origin/gh/jiayisunx/76/head 2025-09-07T07:51:36.2422976Z * [new branch] gh/jiayisunx/76/orig -> origin/gh/jiayisunx/76/orig 2025-09-07T07:51:36.2426308Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-09-07T07:51:36.2427765Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-09-07T07:51:36.2430528Z * [new branch] gh/justinchuby/111/base -> origin/gh/justinchuby/111/base 2025-09-07T07:51:36.2432314Z * [new branch] gh/justinchuby/111/head -> origin/gh/justinchuby/111/head 2025-09-07T07:51:36.2433990Z * [new branch] gh/justinchuby/111/orig -> origin/gh/justinchuby/111/orig 2025-09-07T07:51:36.2436556Z * [new branch] gh/justinchuby/112/base -> origin/gh/justinchuby/112/base 2025-09-07T07:51:36.2438084Z * [new branch] gh/justinchuby/112/head -> origin/gh/justinchuby/112/head 2025-09-07T07:51:36.2439858Z * [new branch] gh/justinchuby/112/orig -> origin/gh/justinchuby/112/orig 2025-09-07T07:51:36.2442053Z * [new branch] gh/justinchuby/113/base -> origin/gh/justinchuby/113/base 2025-09-07T07:51:36.2443595Z * [new branch] gh/justinchuby/113/head -> origin/gh/justinchuby/113/head 2025-09-07T07:51:36.2445347Z * [new branch] gh/justinchuby/113/orig -> origin/gh/justinchuby/113/orig 2025-09-07T07:51:36.2447634Z * [new branch] gh/justinchuby/114/base -> origin/gh/justinchuby/114/base 2025-09-07T07:51:36.2449312Z * [new branch] gh/justinchuby/114/head -> origin/gh/justinchuby/114/head 2025-09-07T07:51:36.2450820Z * [new branch] gh/justinchuby/114/orig -> origin/gh/justinchuby/114/orig 2025-09-07T07:51:36.2452982Z * [new 
branch] gh/justinchuby/115/base -> origin/gh/justinchuby/115/base 2025-09-07T07:51:36.2454639Z * [new branch] gh/justinchuby/115/head -> origin/gh/justinchuby/115/head 2025-09-07T07:51:36.2456561Z * [new branch] gh/justinchuby/115/orig -> origin/gh/justinchuby/115/orig 2025-09-07T07:51:36.2459411Z * [new branch] gh/karthickai/1/base -> origin/gh/karthickai/1/base 2025-09-07T07:51:36.2460990Z * [new branch] gh/karthickai/1/head -> origin/gh/karthickai/1/head 2025-09-07T07:51:36.2462768Z * [new branch] gh/karthickai/1/orig -> origin/gh/karthickai/1/orig 2025-09-07T07:51:36.2464927Z * [new branch] gh/karthickai/2/base -> origin/gh/karthickai/2/base 2025-09-07T07:51:36.2466752Z * [new branch] gh/karthickai/2/head -> origin/gh/karthickai/2/head 2025-09-07T07:51:36.2468363Z * [new branch] gh/karthickai/2/orig -> origin/gh/karthickai/2/orig 2025-09-07T07:51:36.2471189Z * [new branch] gh/kurtamohler/32/base -> origin/gh/kurtamohler/32/base 2025-09-07T07:51:36.2472770Z * [new branch] gh/kurtamohler/32/head -> origin/gh/kurtamohler/32/head 2025-09-07T07:51:36.2474268Z * [new branch] gh/kurtamohler/32/orig -> origin/gh/kurtamohler/32/orig 2025-09-07T07:51:36.2476877Z * [new branch] gh/kurtamohler/33/base -> origin/gh/kurtamohler/33/base 2025-09-07T07:51:36.2478473Z * [new branch] gh/kurtamohler/33/head -> origin/gh/kurtamohler/33/head 2025-09-07T07:51:36.2480018Z * [new branch] gh/kurtamohler/33/orig -> origin/gh/kurtamohler/33/orig 2025-09-07T07:51:36.2482270Z * [new branch] gh/kurtamohler/34/base -> origin/gh/kurtamohler/34/base 2025-09-07T07:51:36.2483856Z * [new branch] gh/kurtamohler/34/head -> origin/gh/kurtamohler/34/head 2025-09-07T07:51:36.2485712Z * [new branch] gh/kurtamohler/34/orig -> origin/gh/kurtamohler/34/orig 2025-09-07T07:51:36.2487997Z * [new branch] gh/kurtamohler/41/base -> origin/gh/kurtamohler/41/base 2025-09-07T07:51:36.2489518Z * [new branch] gh/kurtamohler/41/head -> origin/gh/kurtamohler/41/head 2025-09-07T07:51:36.2491056Z * [new branch] 
gh/kurtamohler/41/orig -> origin/gh/kurtamohler/41/orig 2025-09-07T07:51:36.2493333Z * [new branch] gh/kurtamohler/46/base -> origin/gh/kurtamohler/46/base 2025-09-07T07:51:36.2494898Z * [new branch] gh/kurtamohler/46/head -> origin/gh/kurtamohler/46/head 2025-09-07T07:51:36.2496806Z * [new branch] gh/kurtamohler/46/orig -> origin/gh/kurtamohler/46/orig 2025-09-07T07:51:36.2498990Z * [new branch] gh/kurtamohler/47/base -> origin/gh/kurtamohler/47/base 2025-09-07T07:51:36.2500683Z * [new branch] gh/kurtamohler/47/head -> origin/gh/kurtamohler/47/head 2025-09-07T07:51:36.2502289Z * [new branch] gh/kurtamohler/47/orig -> origin/gh/kurtamohler/47/orig 2025-09-07T07:51:36.2504757Z * [new branch] gh/kurtamohler/48/base -> origin/gh/kurtamohler/48/base 2025-09-07T07:51:36.2506442Z * [new branch] gh/kurtamohler/48/head -> origin/gh/kurtamohler/48/head 2025-09-07T07:51:36.2508033Z * [new branch] gh/kurtamohler/48/orig -> origin/gh/kurtamohler/48/orig 2025-09-07T07:51:36.2510208Z * [new branch] gh/kurtamohler/49/base -> origin/gh/kurtamohler/49/base 2025-09-07T07:51:36.2511709Z * [new branch] gh/kurtamohler/49/head -> origin/gh/kurtamohler/49/head 2025-09-07T07:51:36.2513257Z * [new branch] gh/kurtamohler/49/orig -> origin/gh/kurtamohler/49/orig 2025-09-07T07:51:36.2516040Z * [new branch] gh/kurtamohler/50/base -> origin/gh/kurtamohler/50/base 2025-09-07T07:51:36.2517579Z * [new branch] gh/kurtamohler/50/head -> origin/gh/kurtamohler/50/head 2025-09-07T07:51:36.2519077Z * [new branch] gh/kurtamohler/50/orig -> origin/gh/kurtamohler/50/orig 2025-09-07T07:51:36.2522128Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-09-07T07:51:36.2523898Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-09-07T07:51:36.2525742Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 2025-09-07T07:51:36.2528050Z * [new branch] gh/kwen2501/15/base -> origin/gh/kwen2501/15/base 2025-09-07T07:51:36.2529618Z * [new branch] 
gh/kwen2501/15/head -> origin/gh/kwen2501/15/head 2025-09-07T07:51:36.2531869Z * [new branch] gh/kwen2501/156/base -> origin/gh/kwen2501/156/base 2025-09-07T07:51:36.2533415Z * [new branch] gh/kwen2501/156/head -> origin/gh/kwen2501/156/head 2025-09-07T07:51:36.2535113Z * [new branch] gh/kwen2501/156/orig -> origin/gh/kwen2501/156/orig 2025-09-07T07:51:36.2537618Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-09-07T07:51:36.2539263Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-09-07T07:51:36.2541546Z * [new branch] gh/kwen2501/186/base -> origin/gh/kwen2501/186/base 2025-09-07T07:51:36.2543192Z * [new branch] gh/kwen2501/186/head -> origin/gh/kwen2501/186/head 2025-09-07T07:51:36.2544763Z * [new branch] gh/kwen2501/186/orig -> origin/gh/kwen2501/186/orig 2025-09-07T07:51:36.2547309Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-09-07T07:51:36.2548886Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-09-07T07:51:36.2550525Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-09-07T07:51:36.2552815Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-09-07T07:51:36.2554383Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-09-07T07:51:36.2556276Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-09-07T07:51:36.2558457Z * [new branch] gh/kwen2501/194/base -> origin/gh/kwen2501/194/base 2025-09-07T07:51:36.2560900Z * [new branch] gh/kwen2501/194/head -> origin/gh/kwen2501/194/head 2025-09-07T07:51:36.2562560Z * [new branch] gh/kwen2501/194/orig -> origin/gh/kwen2501/194/orig 2025-09-07T07:51:36.2564769Z * [new branch] gh/kwen2501/199/base -> origin/gh/kwen2501/199/base 2025-09-07T07:51:36.2566729Z * [new branch] gh/kwen2501/199/head -> origin/gh/kwen2501/199/head 2025-09-07T07:51:36.2568149Z * [new branch] gh/kwen2501/199/orig -> origin/gh/kwen2501/199/orig 2025-09-07T07:51:36.2570356Z 
* [new branch] gh/kwen2501/200/base -> origin/gh/kwen2501/200/base 2025-09-07T07:51:36.2572253Z * [new branch] gh/kwen2501/200/head -> origin/gh/kwen2501/200/head 2025-09-07T07:51:36.2573656Z * [new branch] gh/kwen2501/200/orig -> origin/gh/kwen2501/200/orig 2025-09-07T07:51:36.2576193Z * [new branch] gh/kwen2501/201/base -> origin/gh/kwen2501/201/base 2025-09-07T07:51:36.2577774Z * [new branch] gh/kwen2501/201/head -> origin/gh/kwen2501/201/head 2025-09-07T07:51:36.2579302Z * [new branch] gh/kwen2501/201/orig -> origin/gh/kwen2501/201/orig 2025-09-07T07:51:36.2581654Z * [new branch] gh/kwen2501/203/base -> origin/gh/kwen2501/203/base 2025-09-07T07:51:36.2583338Z * [new branch] gh/kwen2501/203/head -> origin/gh/kwen2501/203/head 2025-09-07T07:51:36.2584822Z * [new branch] gh/kwen2501/203/orig -> origin/gh/kwen2501/203/orig 2025-09-07T07:51:36.2587329Z * [new branch] gh/kwen2501/204/base -> origin/gh/kwen2501/204/base 2025-09-07T07:51:36.2588983Z * [new branch] gh/kwen2501/204/head -> origin/gh/kwen2501/204/head 2025-09-07T07:51:36.2590493Z * [new branch] gh/kwen2501/204/orig -> origin/gh/kwen2501/204/orig 2025-09-07T07:51:36.2592756Z * [new branch] gh/kwen2501/205/base -> origin/gh/kwen2501/205/base 2025-09-07T07:51:36.2594344Z * [new branch] gh/kwen2501/205/head -> origin/gh/kwen2501/205/head 2025-09-07T07:51:36.2596314Z * [new branch] gh/kwen2501/205/orig -> origin/gh/kwen2501/205/orig 2025-09-07T07:51:36.2598434Z * [new branch] gh/kwen2501/206/base -> origin/gh/kwen2501/206/base 2025-09-07T07:51:36.2600053Z * [new branch] gh/kwen2501/206/head -> origin/gh/kwen2501/206/head 2025-09-07T07:51:36.2601583Z * [new branch] gh/kwen2501/206/orig -> origin/gh/kwen2501/206/orig 2025-09-07T07:51:36.2603846Z * [new branch] gh/kwen2501/207/base -> origin/gh/kwen2501/207/base 2025-09-07T07:51:36.2605704Z * [new branch] gh/kwen2501/207/head -> origin/gh/kwen2501/207/head 2025-09-07T07:51:36.2607255Z * [new branch] gh/kwen2501/207/orig -> origin/gh/kwen2501/207/orig 
2025-09-07T07:51:36.2609542Z * [new branch] gh/kwen2501/208/base -> origin/gh/kwen2501/208/base 2025-09-07T07:51:36.2611158Z * [new branch] gh/kwen2501/208/head -> origin/gh/kwen2501/208/head 2025-09-07T07:51:36.2612691Z * [new branch] gh/kwen2501/208/orig -> origin/gh/kwen2501/208/orig 2025-09-07T07:51:36.2615094Z * [new branch] gh/kwen2501/209/base -> origin/gh/kwen2501/209/base 2025-09-07T07:51:36.2617022Z * [new branch] gh/kwen2501/209/head -> origin/gh/kwen2501/209/head 2025-09-07T07:51:36.2618564Z * [new branch] gh/kwen2501/209/orig -> origin/gh/kwen2501/209/orig 2025-09-07T07:51:36.2620830Z * [new branch] gh/kwen2501/210/base -> origin/gh/kwen2501/210/base 2025-09-07T07:51:36.2622510Z * [new branch] gh/kwen2501/210/head -> origin/gh/kwen2501/210/head 2025-09-07T07:51:36.2624006Z * [new branch] gh/kwen2501/210/orig -> origin/gh/kwen2501/210/orig 2025-09-07T07:51:36.2626824Z * [new branch] gh/kwen2501/211/base -> origin/gh/kwen2501/211/base 2025-09-07T07:51:36.2628433Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-09-07T07:51:36.2630699Z * [new branch] gh/kwen2501/212/base -> origin/gh/kwen2501/212/base 2025-09-07T07:51:36.2632224Z * [new branch] gh/kwen2501/212/head -> origin/gh/kwen2501/212/head 2025-09-07T07:51:36.2633827Z * [new branch] gh/kwen2501/212/orig -> origin/gh/kwen2501/212/orig 2025-09-07T07:51:36.2636445Z * [new branch] gh/kwen2501/213/base -> origin/gh/kwen2501/213/base 2025-09-07T07:51:36.2638141Z * [new branch] gh/kwen2501/213/head -> origin/gh/kwen2501/213/head 2025-09-07T07:51:36.2639546Z * [new branch] gh/kwen2501/213/orig -> origin/gh/kwen2501/213/orig 2025-09-07T07:51:36.2641831Z * [new branch] gh/kwen2501/214/base -> origin/gh/kwen2501/214/base 2025-09-07T07:51:36.2643479Z * [new branch] gh/kwen2501/214/head -> origin/gh/kwen2501/214/head 2025-09-07T07:51:36.2645141Z * [new branch] gh/kwen2501/214/orig -> origin/gh/kwen2501/214/orig 2025-09-07T07:51:36.2647675Z * [new branch] gh/kwen2501/215/base -> 
origin/gh/kwen2501/215/base 2025-09-07T07:51:36.2649238Z * [new branch] gh/kwen2501/215/head -> origin/gh/kwen2501/215/head 2025-09-07T07:51:36.2650786Z * [new branch] gh/kwen2501/215/orig -> origin/gh/kwen2501/215/orig 2025-09-07T07:51:36.2652988Z * [new branch] gh/kwen2501/216/base -> origin/gh/kwen2501/216/base 2025-09-07T07:51:36.2654672Z * [new branch] gh/kwen2501/216/head -> origin/gh/kwen2501/216/head 2025-09-07T07:51:36.2656548Z * [new branch] gh/kwen2501/216/orig -> origin/gh/kwen2501/216/orig 2025-09-07T07:51:36.2658727Z * [new branch] gh/kwen2501/217/base -> origin/gh/kwen2501/217/base 2025-09-07T07:51:36.2660294Z * [new branch] gh/kwen2501/217/head -> origin/gh/kwen2501/217/head 2025-09-07T07:51:36.2661937Z * [new branch] gh/kwen2501/217/orig -> origin/gh/kwen2501/217/orig 2025-09-07T07:51:36.2664198Z * [new branch] gh/kwen2501/218/base -> origin/gh/kwen2501/218/base 2025-09-07T07:51:36.2666142Z * [new branch] gh/kwen2501/218/head -> origin/gh/kwen2501/218/head 2025-09-07T07:51:36.2667771Z * [new branch] gh/kwen2501/218/orig -> origin/gh/kwen2501/218/orig 2025-09-07T07:51:36.2670079Z * [new branch] gh/kwen2501/219/base -> origin/gh/kwen2501/219/base 2025-09-07T07:51:36.2671696Z * [new branch] gh/kwen2501/219/head -> origin/gh/kwen2501/219/head 2025-09-07T07:51:36.2673251Z * [new branch] gh/kwen2501/219/orig -> origin/gh/kwen2501/219/orig 2025-09-07T07:51:36.2675752Z * [new branch] gh/kwen2501/220/base -> origin/gh/kwen2501/220/base 2025-09-07T07:51:36.2677319Z * [new branch] gh/kwen2501/220/head -> origin/gh/kwen2501/220/head 2025-09-07T07:51:36.2678905Z * [new branch] gh/kwen2501/220/orig -> origin/gh/kwen2501/220/orig 2025-09-07T07:51:36.2681220Z * [new branch] gh/kwen2501/221/base -> origin/gh/kwen2501/221/base 2025-09-07T07:51:36.2682832Z * [new branch] gh/kwen2501/221/head -> origin/gh/kwen2501/221/head 2025-09-07T07:51:36.2684356Z * [new branch] gh/kwen2501/221/orig -> origin/gh/kwen2501/221/orig 2025-09-07T07:51:36.2686931Z * [new branch] 
gh/kwen2501/222/base -> origin/gh/kwen2501/222/base 2025-09-07T07:51:36.2688519Z * [new branch] gh/kwen2501/222/head -> origin/gh/kwen2501/222/head 2025-09-07T07:51:36.2689971Z * [new branch] gh/kwen2501/222/orig -> origin/gh/kwen2501/222/orig 2025-09-07T07:51:36.2692242Z * [new branch] gh/kwen2501/223/base -> origin/gh/kwen2501/223/base 2025-09-07T07:51:36.2693863Z * [new branch] gh/kwen2501/223/head -> origin/gh/kwen2501/223/head 2025-09-07T07:51:36.2695642Z * [new branch] gh/kwen2501/223/orig -> origin/gh/kwen2501/223/orig 2025-09-07T07:51:36.2697904Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 2025-09-07T07:51:36.2699462Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-09-07T07:51:36.2701011Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-09-07T07:51:36.2703636Z * [new branch] gh/kwen2501/225/base -> origin/gh/kwen2501/225/base 2025-09-07T07:51:36.2705274Z * [new branch] gh/kwen2501/225/head -> origin/gh/kwen2501/225/head 2025-09-07T07:51:36.2706945Z * [new branch] gh/kwen2501/225/orig -> origin/gh/kwen2501/225/orig 2025-09-07T07:51:36.2709259Z * [new branch] gh/kwen2501/226/base -> origin/gh/kwen2501/226/base 2025-09-07T07:51:36.2710984Z * [new branch] gh/kwen2501/226/head -> origin/gh/kwen2501/226/head 2025-09-07T07:51:36.2712616Z * [new branch] gh/kwen2501/226/orig -> origin/gh/kwen2501/226/orig 2025-09-07T07:51:36.2714868Z * [new branch] gh/kwen2501/227/base -> origin/gh/kwen2501/227/base 2025-09-07T07:51:36.2716820Z * [new branch] gh/kwen2501/227/head -> origin/gh/kwen2501/227/head 2025-09-07T07:51:36.2718352Z * [new branch] gh/kwen2501/227/orig -> origin/gh/kwen2501/227/orig 2025-09-07T07:51:36.2720561Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-09-07T07:51:36.2722143Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-09-07T07:51:36.2723710Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 
2025-09-07T07:51:36.2726296Z * [new branch] gh/kwen2501/229/base -> origin/gh/kwen2501/229/base 2025-09-07T07:51:36.2727870Z * [new branch] gh/kwen2501/229/head -> origin/gh/kwen2501/229/head 2025-09-07T07:51:36.2729357Z * [new branch] gh/kwen2501/229/orig -> origin/gh/kwen2501/229/orig 2025-09-07T07:51:36.2731708Z * [new branch] gh/kwen2501/230/base -> origin/gh/kwen2501/230/base 2025-09-07T07:51:36.2733297Z * [new branch] gh/kwen2501/230/head -> origin/gh/kwen2501/230/head 2025-09-07T07:51:36.2734789Z * [new branch] gh/kwen2501/230/orig -> origin/gh/kwen2501/230/orig 2025-09-07T07:51:36.2737506Z * [new branch] gh/kwen2501/231/base -> origin/gh/kwen2501/231/base 2025-09-07T07:51:36.2739129Z * [new branch] gh/kwen2501/231/head -> origin/gh/kwen2501/231/head 2025-09-07T07:51:36.2740622Z * [new branch] gh/kwen2501/231/orig -> origin/gh/kwen2501/231/orig 2025-09-07T07:51:36.2743025Z * [new branch] gh/kwen2501/232/base -> origin/gh/kwen2501/232/base 2025-09-07T07:51:36.2744585Z * [new branch] gh/kwen2501/232/head -> origin/gh/kwen2501/232/head 2025-09-07T07:51:36.2746431Z * [new branch] gh/kwen2501/232/orig -> origin/gh/kwen2501/232/orig 2025-09-07T07:51:36.2749364Z * [new branch] gh/laithsakka/156/base -> origin/gh/laithsakka/156/base 2025-09-07T07:51:36.2750983Z * [new branch] gh/laithsakka/156/head -> origin/gh/laithsakka/156/head 2025-09-07T07:51:36.2752520Z * [new branch] gh/laithsakka/156/orig -> origin/gh/laithsakka/156/orig 2025-09-07T07:51:36.2755080Z * [new branch] gh/laithsakka/160/base -> origin/gh/laithsakka/160/base 2025-09-07T07:51:36.2756921Z * [new branch] gh/laithsakka/160/head -> origin/gh/laithsakka/160/head 2025-09-07T07:51:36.2758496Z * [new branch] gh/laithsakka/160/orig -> origin/gh/laithsakka/160/orig 2025-09-07T07:51:36.2760762Z * [new branch] gh/laithsakka/178/base -> origin/gh/laithsakka/178/base 2025-09-07T07:51:36.2762377Z * [new branch] gh/laithsakka/178/head -> origin/gh/laithsakka/178/head 2025-09-07T07:51:36.2763925Z * [new branch] 
gh/laithsakka/178/orig -> origin/gh/laithsakka/178/orig 2025-09-07T07:51:36.2766475Z * [new branch] gh/laithsakka/191/base -> origin/gh/laithsakka/191/base 2025-09-07T07:51:36.2768044Z * [new branch] gh/laithsakka/191/head -> origin/gh/laithsakka/191/head 2025-09-07T07:51:36.2769669Z * [new branch] gh/laithsakka/191/orig -> origin/gh/laithsakka/191/orig 2025-09-07T07:51:36.2771672Z * [new branch] gh/laithsakka/237/base -> origin/gh/laithsakka/237/base 2025-09-07T07:51:36.2773270Z * [new branch] gh/laithsakka/237/head -> origin/gh/laithsakka/237/head 2025-09-07T07:51:36.2774733Z * [new branch] gh/laithsakka/237/orig -> origin/gh/laithsakka/237/orig 2025-09-07T07:51:36.2777261Z * [new branch] gh/laithsakka/249/base -> origin/gh/laithsakka/249/base 2025-09-07T07:51:36.2778891Z * [new branch] gh/laithsakka/249/head -> origin/gh/laithsakka/249/head 2025-09-07T07:51:36.2780390Z * [new branch] gh/laithsakka/249/orig -> origin/gh/laithsakka/249/orig 2025-09-07T07:51:36.2782922Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-09-07T07:51:36.2784503Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-09-07T07:51:36.2786329Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-09-07T07:51:36.2788725Z * [new branch] gh/laithsakka/254/base -> origin/gh/laithsakka/254/base 2025-09-07T07:51:36.2790238Z * [new branch] gh/laithsakka/254/head -> origin/gh/laithsakka/254/head 2025-09-07T07:51:36.2791875Z * [new branch] gh/laithsakka/254/orig -> origin/gh/laithsakka/254/orig 2025-09-07T07:51:36.2794229Z * [new branch] gh/laithsakka/255/base -> origin/gh/laithsakka/255/base 2025-09-07T07:51:36.2796076Z * [new branch] gh/laithsakka/255/head -> origin/gh/laithsakka/255/head 2025-09-07T07:51:36.2797614Z * [new branch] gh/laithsakka/255/orig -> origin/gh/laithsakka/255/orig 2025-09-07T07:51:36.2799883Z * [new branch] gh/laithsakka/256/base -> origin/gh/laithsakka/256/base 2025-09-07T07:51:36.2801482Z * [new branch] 
gh/laithsakka/256/head -> origin/gh/laithsakka/256/head 2025-09-07T07:51:36.2802967Z * [new branch] gh/laithsakka/256/orig -> origin/gh/laithsakka/256/orig 2025-09-07T07:51:36.2805727Z * [new branch] gh/laithsakka/257/base -> origin/gh/laithsakka/257/base 2025-09-07T07:51:36.2807329Z * [new branch] gh/laithsakka/257/head -> origin/gh/laithsakka/257/head 2025-09-07T07:51:36.2808884Z * [new branch] gh/laithsakka/257/orig -> origin/gh/laithsakka/257/orig 2025-09-07T07:51:36.2811167Z * [new branch] gh/laithsakka/258/base -> origin/gh/laithsakka/258/base 2025-09-07T07:51:36.2812797Z * [new branch] gh/laithsakka/258/head -> origin/gh/laithsakka/258/head 2025-09-07T07:51:36.2814310Z * [new branch] gh/laithsakka/258/orig -> origin/gh/laithsakka/258/orig 2025-09-07T07:51:36.2816876Z * [new branch] gh/laithsakka/259/base -> origin/gh/laithsakka/259/base 2025-09-07T07:51:36.2818513Z * [new branch] gh/laithsakka/259/head -> origin/gh/laithsakka/259/head 2025-09-07T07:51:36.2820085Z * [new branch] gh/laithsakka/259/orig -> origin/gh/laithsakka/259/orig 2025-09-07T07:51:36.2822366Z * [new branch] gh/laithsakka/260/base -> origin/gh/laithsakka/260/base 2025-09-07T07:51:36.2824004Z * [new branch] gh/laithsakka/260/head -> origin/gh/laithsakka/260/head 2025-09-07T07:51:36.2825837Z * [new branch] gh/laithsakka/260/orig -> origin/gh/laithsakka/260/orig 2025-09-07T07:51:36.2828139Z * [new branch] gh/laithsakka/261/base -> origin/gh/laithsakka/261/base 2025-09-07T07:51:36.2829679Z * [new branch] gh/laithsakka/261/head -> origin/gh/laithsakka/261/head 2025-09-07T07:51:36.2831215Z * [new branch] gh/laithsakka/261/orig -> origin/gh/laithsakka/261/orig 2025-09-07T07:51:36.2833842Z * [new branch] gh/laithsakka/262/base -> origin/gh/laithsakka/262/base 2025-09-07T07:51:36.2836272Z * [new branch] gh/laithsakka/262/head -> origin/gh/laithsakka/262/head 2025-09-07T07:51:36.2837797Z * [new branch] gh/laithsakka/262/orig -> origin/gh/laithsakka/262/orig 2025-09-07T07:51:36.2840005Z * [new branch] 
gh/laithsakka/263/base -> origin/gh/laithsakka/263/base 2025-09-07T07:51:36.2841560Z * [new branch] gh/laithsakka/263/head -> origin/gh/laithsakka/263/head 2025-09-07T07:51:36.2843124Z * [new branch] gh/laithsakka/263/orig -> origin/gh/laithsakka/263/orig 2025-09-07T07:51:36.2845639Z * [new branch] gh/laithsakka/264/base -> origin/gh/laithsakka/264/base 2025-09-07T07:51:36.2847127Z * [new branch] gh/laithsakka/264/head -> origin/gh/laithsakka/264/head 2025-09-07T07:51:36.2848633Z * [new branch] gh/laithsakka/264/orig -> origin/gh/laithsakka/264/orig 2025-09-07T07:51:36.2851176Z * [new branch] gh/laithsakka/265/base -> origin/gh/laithsakka/265/base 2025-09-07T07:51:36.2852742Z * [new branch] gh/laithsakka/265/head -> origin/gh/laithsakka/265/head 2025-09-07T07:51:36.2854295Z * [new branch] gh/laithsakka/265/orig -> origin/gh/laithsakka/265/orig 2025-09-07T07:51:36.2856895Z * [new branch] gh/laithsakka/266/base -> origin/gh/laithsakka/266/base 2025-09-07T07:51:36.2858515Z * [new branch] gh/laithsakka/266/head -> origin/gh/laithsakka/266/head 2025-09-07T07:51:36.2860027Z * [new branch] gh/laithsakka/266/orig -> origin/gh/laithsakka/266/orig 2025-09-07T07:51:36.2862408Z * [new branch] gh/laithsakka/267/base -> origin/gh/laithsakka/267/base 2025-09-07T07:51:36.2864015Z * [new branch] gh/laithsakka/267/head -> origin/gh/laithsakka/267/head 2025-09-07T07:51:36.2865829Z * [new branch] gh/laithsakka/267/orig -> origin/gh/laithsakka/267/orig 2025-09-07T07:51:36.2868113Z * [new branch] gh/laithsakka/268/base -> origin/gh/laithsakka/268/base 2025-09-07T07:51:36.2869635Z * [new branch] gh/laithsakka/268/head -> origin/gh/laithsakka/268/head 2025-09-07T07:51:36.2871224Z * [new branch] gh/laithsakka/268/orig -> origin/gh/laithsakka/268/orig 2025-09-07T07:51:36.2873665Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-09-07T07:51:36.2876167Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-09-07T07:51:36.2878277Z * [new branch] 
gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-09-07T07:51:36.2879899Z * [new branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-09-07T07:51:36.2882034Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-09-07T07:51:36.2883551Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-09-07T07:51:36.2886075Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 2025-09-07T07:51:36.2887633Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 2025-09-07T07:51:36.2891792Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-09-07T07:51:36.2893427Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-09-07T07:51:36.2896072Z * [new branch] gh/lucaskabela/10/base -> origin/gh/lucaskabela/10/base 2025-09-07T07:51:36.2897703Z * [new branch] gh/lucaskabela/10/head -> origin/gh/lucaskabela/10/head 2025-09-07T07:51:36.2899217Z * [new branch] gh/lucaskabela/10/orig -> origin/gh/lucaskabela/10/orig 2025-09-07T07:51:36.2901352Z * [new branch] gh/lucaskabela/11/base -> origin/gh/lucaskabela/11/base 2025-09-07T07:51:36.2903467Z * [new branch] gh/lucaskabela/11/head -> origin/gh/lucaskabela/11/head 2025-09-07T07:51:36.2905080Z * [new branch] gh/lucaskabela/11/orig -> origin/gh/lucaskabela/11/orig 2025-09-07T07:51:36.2907317Z * [new branch] gh/lucaskabela/12/base -> origin/gh/lucaskabela/12/base 2025-09-07T07:51:36.2908914Z * [new branch] gh/lucaskabela/12/head -> origin/gh/lucaskabela/12/head 2025-09-07T07:51:36.2910467Z * [new branch] gh/lucaskabela/12/orig -> origin/gh/lucaskabela/12/orig 2025-09-07T07:51:36.2912719Z * [new branch] gh/lucaskabela/13/base -> origin/gh/lucaskabela/13/base 2025-09-07T07:51:36.2914247Z * [new branch] gh/lucaskabela/13/head -> origin/gh/lucaskabela/13/head 2025-09-07T07:51:36.2916228Z * [new branch] gh/lucaskabela/13/orig -> origin/gh/lucaskabela/13/orig 2025-09-07T07:51:36.2918406Z * [new branch] 
gh/lucaskabela/14/base -> origin/gh/lucaskabela/14/base 2025-09-07T07:51:36.2919994Z * [new branch] gh/lucaskabela/14/head -> origin/gh/lucaskabela/14/head 2025-09-07T07:51:36.2921562Z * [new branch] gh/lucaskabela/14/orig -> origin/gh/lucaskabela/14/orig 2025-09-07T07:51:36.2923748Z * [new branch] gh/lucaskabela/15/base -> origin/gh/lucaskabela/15/base 2025-09-07T07:51:36.2925514Z * [new branch] gh/lucaskabela/15/head -> origin/gh/lucaskabela/15/head 2025-09-07T07:51:36.2927223Z * [new branch] gh/lucaskabela/15/orig -> origin/gh/lucaskabela/15/orig 2025-09-07T07:51:36.2929371Z * [new branch] gh/lucaskabela/16/base -> origin/gh/lucaskabela/16/base 2025-09-07T07:51:36.2930980Z * [new branch] gh/lucaskabela/16/head -> origin/gh/lucaskabela/16/head 2025-09-07T07:51:36.2932496Z * [new branch] gh/lucaskabela/16/orig -> origin/gh/lucaskabela/16/orig 2025-09-07T07:51:36.2934636Z * [new branch] gh/lucaskabela/17/base -> origin/gh/lucaskabela/17/base 2025-09-07T07:51:36.2936557Z * [new branch] gh/lucaskabela/17/head -> origin/gh/lucaskabela/17/head 2025-09-07T07:51:36.2938124Z * [new branch] gh/lucaskabela/17/orig -> origin/gh/lucaskabela/17/orig 2025-09-07T07:51:36.2940354Z * [new branch] gh/lucaskabela/2/base -> origin/gh/lucaskabela/2/base 2025-09-07T07:51:36.2941986Z * [new branch] gh/lucaskabela/2/head -> origin/gh/lucaskabela/2/head 2025-09-07T07:51:36.2943561Z * [new branch] gh/lucaskabela/2/orig -> origin/gh/lucaskabela/2/orig 2025-09-07T07:51:36.2946280Z * [new branch] gh/lucaskabela/3/base -> origin/gh/lucaskabela/3/base 2025-09-07T07:51:36.2947899Z * [new branch] gh/lucaskabela/3/head -> origin/gh/lucaskabela/3/head 2025-09-07T07:51:36.2949597Z * [new branch] gh/lucaskabela/3/orig -> origin/gh/lucaskabela/3/orig 2025-09-07T07:51:36.2951804Z * [new branch] gh/lucaskabela/4/base -> origin/gh/lucaskabela/4/base 2025-09-07T07:51:36.2953384Z * [new branch] gh/lucaskabela/4/head -> origin/gh/lucaskabela/4/head 2025-09-07T07:51:36.2954921Z * [new branch] 
gh/lucaskabela/4/orig -> origin/gh/lucaskabela/4/orig 2025-09-07T07:51:36.2957511Z * [new branch] gh/lucaskabela/5/base -> origin/gh/lucaskabela/5/base 2025-09-07T07:51:36.2958953Z * [new branch] gh/lucaskabela/5/head -> origin/gh/lucaskabela/5/head 2025-09-07T07:51:36.2960528Z * [new branch] gh/lucaskabela/5/orig -> origin/gh/lucaskabela/5/orig 2025-09-07T07:51:36.2962740Z * [new branch] gh/lucaskabela/6/base -> origin/gh/lucaskabela/6/base 2025-09-07T07:51:36.2964362Z * [new branch] gh/lucaskabela/6/head -> origin/gh/lucaskabela/6/head 2025-09-07T07:51:36.2966264Z * [new branch] gh/lucaskabela/6/orig -> origin/gh/lucaskabela/6/orig 2025-09-07T07:51:36.2968855Z * [new branch] gh/lucaskabela/7/base -> origin/gh/lucaskabela/7/base 2025-09-07T07:51:36.2970293Z * [new branch] gh/lucaskabela/7/head -> origin/gh/lucaskabela/7/head 2025-09-07T07:51:36.2971777Z * [new branch] gh/lucaskabela/7/orig -> origin/gh/lucaskabela/7/orig 2025-09-07T07:51:36.2974227Z * [new branch] gh/lucaskabela/8/base -> origin/gh/lucaskabela/8/base 2025-09-07T07:51:36.2976077Z * [new branch] gh/lucaskabela/8/head -> origin/gh/lucaskabela/8/head 2025-09-07T07:51:36.2977695Z * [new branch] gh/lucaskabela/8/orig -> origin/gh/lucaskabela/8/orig 2025-09-07T07:51:36.2980167Z * [new branch] gh/lucaskabela/9/base -> origin/gh/lucaskabela/9/base 2025-09-07T07:51:36.2981580Z * [new branch] gh/lucaskabela/9/head -> origin/gh/lucaskabela/9/head 2025-09-07T07:51:36.2983225Z * [new branch] gh/lucaskabela/9/orig -> origin/gh/lucaskabela/9/orig 2025-09-07T07:51:36.2986434Z * [new branch] gh/lw/3/base -> origin/gh/lw/3/base 2025-09-07T07:51:36.2987991Z * [new branch] gh/lw/3/head -> origin/gh/lw/3/head 2025-09-07T07:51:36.2989493Z * [new branch] gh/lw/3/orig -> origin/gh/lw/3/orig 2025-09-07T07:51:36.2992339Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 2025-09-07T07:51:36.2994668Z * [new branch] gh/malfet/330/base -> origin/gh/malfet/330/base 2025-09-07T07:51:36.2996646Z * [new branch] 
gh/malfet/330/head -> origin/gh/malfet/330/head 2025-09-07T07:51:36.2998375Z * [new branch] gh/malfet/330/orig -> origin/gh/malfet/330/orig 2025-09-07T07:51:36.3000688Z * [new branch] gh/malfet/396/base -> origin/gh/malfet/396/base 2025-09-07T07:51:36.3002203Z * [new branch] gh/malfet/396/head -> origin/gh/malfet/396/head 2025-09-07T07:51:36.3003767Z * [new branch] gh/malfet/396/orig -> origin/gh/malfet/396/orig 2025-09-07T07:51:36.3006360Z * [new branch] gh/malfet/397/base -> origin/gh/malfet/397/base 2025-09-07T07:51:36.3007954Z * [new branch] gh/malfet/397/head -> origin/gh/malfet/397/head 2025-09-07T07:51:36.3009492Z * [new branch] gh/malfet/397/orig -> origin/gh/malfet/397/orig 2025-09-07T07:51:36.3011641Z * [new branch] gh/malfet/398/base -> origin/gh/malfet/398/base 2025-09-07T07:51:36.3013320Z * [new branch] gh/malfet/398/head -> origin/gh/malfet/398/head 2025-09-07T07:51:36.3014834Z * [new branch] gh/malfet/398/orig -> origin/gh/malfet/398/orig 2025-09-07T07:51:36.3017382Z * [new branch] gh/malfet/399/base -> origin/gh/malfet/399/base 2025-09-07T07:51:36.3018972Z * [new branch] gh/malfet/399/head -> origin/gh/malfet/399/head 2025-09-07T07:51:36.3020473Z * [new branch] gh/malfet/399/orig -> origin/gh/malfet/399/orig 2025-09-07T07:51:36.3022837Z * [new branch] gh/malfet/414/base -> origin/gh/malfet/414/base 2025-09-07T07:51:36.3024426Z * [new branch] gh/malfet/414/head -> origin/gh/malfet/414/head 2025-09-07T07:51:36.3026327Z * [new branch] gh/malfet/414/orig -> origin/gh/malfet/414/orig 2025-09-07T07:51:36.3028598Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-09-07T07:51:36.3030173Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-09-07T07:51:36.3031735Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-09-07T07:51:36.3033950Z * [new branch] gh/malfet/418/base -> origin/gh/malfet/418/base 2025-09-07T07:51:36.3036012Z * [new branch] gh/malfet/418/head -> origin/gh/malfet/418/head 
2025-09-07T07:51:36.3037508Z * [new branch] gh/malfet/418/orig -> origin/gh/malfet/418/orig 2025-09-07T07:51:36.3039785Z * [new branch] gh/malfet/475/base -> origin/gh/malfet/475/base 2025-09-07T07:51:36.3041435Z * [new branch] gh/malfet/475/head -> origin/gh/malfet/475/head 2025-09-07T07:51:36.3043045Z * [new branch] gh/malfet/475/orig -> origin/gh/malfet/475/orig 2025-09-07T07:51:36.3045449Z * [new branch] gh/malfet/476/base -> origin/gh/malfet/476/base 2025-09-07T07:51:36.3047115Z * [new branch] gh/malfet/476/head -> origin/gh/malfet/476/head 2025-09-07T07:51:36.3048694Z * [new branch] gh/malfet/476/orig -> origin/gh/malfet/476/orig 2025-09-07T07:51:36.3050858Z * [new branch] gh/malfet/477/base -> origin/gh/malfet/477/base 2025-09-07T07:51:36.3052436Z * [new branch] gh/malfet/477/head -> origin/gh/malfet/477/head 2025-09-07T07:51:36.3054055Z * [new branch] gh/malfet/477/orig -> origin/gh/malfet/477/orig 2025-09-07T07:51:36.3056495Z * [new branch] gh/malfet/478/base -> origin/gh/malfet/478/base 2025-09-07T07:51:36.3058143Z * [new branch] gh/malfet/478/head -> origin/gh/malfet/478/head 2025-09-07T07:51:36.3059700Z * [new branch] gh/malfet/478/orig -> origin/gh/malfet/478/orig 2025-09-07T07:51:36.3062069Z * [new branch] gh/malfet/479/base -> origin/gh/malfet/479/base 2025-09-07T07:51:36.3063680Z * [new branch] gh/malfet/479/head -> origin/gh/malfet/479/head 2025-09-07T07:51:36.3065396Z * [new branch] gh/malfet/479/orig -> origin/gh/malfet/479/orig 2025-09-07T07:51:36.3070264Z * [new branch] gh/malfet/480/base -> origin/gh/malfet/480/base 2025-09-07T07:51:36.3072381Z * [new branch] gh/malfet/480/head -> origin/gh/malfet/480/head 2025-09-07T07:51:36.3072561Z * [new branch] gh/malfet/480/orig -> origin/gh/malfet/480/orig 2025-09-07T07:51:36.3073719Z * [new branch] gh/malfet/481/base -> origin/gh/malfet/481/base 2025-09-07T07:51:36.3075328Z * [new branch] gh/malfet/481/head -> origin/gh/malfet/481/head 2025-09-07T07:51:36.3076957Z * [new branch] gh/malfet/481/orig -> 
origin/gh/malfet/481/orig 2025-09-07T07:51:36.3079193Z * [new branch] gh/malfet/482/base -> origin/gh/malfet/482/base 2025-09-07T07:51:36.3080750Z * [new branch] gh/malfet/482/head -> origin/gh/malfet/482/head 2025-09-07T07:51:36.3082331Z * [new branch] gh/malfet/482/orig -> origin/gh/malfet/482/orig 2025-09-07T07:51:36.3084565Z * [new branch] gh/malfet/483/base -> origin/gh/malfet/483/base 2025-09-07T07:51:36.3086524Z * [new branch] gh/malfet/483/head -> origin/gh/malfet/483/head 2025-09-07T07:51:36.3087975Z * [new branch] gh/malfet/483/orig -> origin/gh/malfet/483/orig 2025-09-07T07:51:36.3090216Z * [new branch] gh/malfet/484/base -> origin/gh/malfet/484/base 2025-09-07T07:51:36.3092042Z * [new branch] gh/malfet/484/head -> origin/gh/malfet/484/head 2025-09-07T07:51:36.3093720Z * [new branch] gh/malfet/484/orig -> origin/gh/malfet/484/orig 2025-09-07T07:51:36.3096240Z * [new branch] gh/malfet/485/base -> origin/gh/malfet/485/base 2025-09-07T07:51:36.3097835Z * [new branch] gh/malfet/485/head -> origin/gh/malfet/485/head 2025-09-07T07:51:36.3099494Z * [new branch] gh/malfet/485/orig -> origin/gh/malfet/485/orig 2025-09-07T07:51:36.3101838Z * [new branch] gh/malfet/486/base -> origin/gh/malfet/486/base 2025-09-07T07:51:36.3103622Z * [new branch] gh/malfet/486/head -> origin/gh/malfet/486/head 2025-09-07T07:51:36.3105324Z * [new branch] gh/malfet/486/orig -> origin/gh/malfet/486/orig 2025-09-07T07:51:36.3107652Z * [new branch] gh/malfet/487/base -> origin/gh/malfet/487/base 2025-09-07T07:51:36.3109177Z * [new branch] gh/malfet/487/head -> origin/gh/malfet/487/head 2025-09-07T07:51:36.3110741Z * [new branch] gh/malfet/487/orig -> origin/gh/malfet/487/orig 2025-09-07T07:51:36.3113046Z * [new branch] gh/malfet/488/base -> origin/gh/malfet/488/base 2025-09-07T07:51:36.3114565Z * [new branch] gh/malfet/488/head -> origin/gh/malfet/488/head 2025-09-07T07:51:36.3116436Z * [new branch] gh/malfet/488/orig -> origin/gh/malfet/488/orig 2025-09-07T07:51:36.3118766Z * [new 
branch] gh/malfet/489/base -> origin/gh/malfet/489/base 2025-09-07T07:51:36.3120400Z * [new branch] gh/malfet/489/head -> origin/gh/malfet/489/head 2025-09-07T07:51:36.3122017Z * [new branch] gh/malfet/489/orig -> origin/gh/malfet/489/orig 2025-09-07T07:51:36.3124320Z * [new branch] gh/malfet/490/base -> origin/gh/malfet/490/base 2025-09-07T07:51:36.3126213Z * [new branch] gh/malfet/490/head -> origin/gh/malfet/490/head 2025-09-07T07:51:36.3127907Z * [new branch] gh/malfet/490/orig -> origin/gh/malfet/490/orig 2025-09-07T07:51:36.3130098Z * [new branch] gh/malfet/491/base -> origin/gh/malfet/491/base 2025-09-07T07:51:36.3131752Z * [new branch] gh/malfet/491/head -> origin/gh/malfet/491/head 2025-09-07T07:51:36.3133370Z * [new branch] gh/malfet/491/orig -> origin/gh/malfet/491/orig 2025-09-07T07:51:36.3135796Z * [new branch] gh/malfet/492/base -> origin/gh/malfet/492/base 2025-09-07T07:51:36.3137551Z * [new branch] gh/malfet/492/head -> origin/gh/malfet/492/head 2025-09-07T07:51:36.3139120Z * [new branch] gh/malfet/492/orig -> origin/gh/malfet/492/orig 2025-09-07T07:51:36.3141543Z * [new branch] gh/malfet/493/base -> origin/gh/malfet/493/base 2025-09-07T07:51:36.3143146Z * [new branch] gh/malfet/493/head -> origin/gh/malfet/493/head 2025-09-07T07:51:36.3144670Z * [new branch] gh/malfet/493/orig -> origin/gh/malfet/493/orig 2025-09-07T07:51:36.3147118Z * [new branch] gh/malfet/494/base -> origin/gh/malfet/494/base 2025-09-07T07:51:36.3148723Z * [new branch] gh/malfet/494/head -> origin/gh/malfet/494/head 2025-09-07T07:51:36.3150426Z * [new branch] gh/malfet/494/orig -> origin/gh/malfet/494/orig 2025-09-07T07:51:36.3152644Z * [new branch] gh/malfet/495/base -> origin/gh/malfet/495/base 2025-09-07T07:51:36.3154319Z * [new branch] gh/malfet/495/head -> origin/gh/malfet/495/head 2025-09-07T07:51:36.3156144Z * [new branch] gh/malfet/495/orig -> origin/gh/malfet/495/orig 2025-09-07T07:51:36.3158430Z * [new branch] gh/malfet/496/base -> origin/gh/malfet/496/base 
2025-09-07T07:51:36.3160023Z * [new branch] gh/malfet/496/head -> origin/gh/malfet/496/head 2025-09-07T07:51:36.3161589Z * [new branch] gh/malfet/496/orig -> origin/gh/malfet/496/orig 2025-09-07T07:51:36.3163822Z * [new branch] gh/malfet/497/base -> origin/gh/malfet/497/base 2025-09-07T07:51:36.3165727Z * [new branch] gh/malfet/497/head -> origin/gh/malfet/497/head 2025-09-07T07:51:36.3167426Z * [new branch] gh/malfet/497/orig -> origin/gh/malfet/497/orig 2025-09-07T07:51:36.3169891Z * [new branch] gh/malfet/498/base -> origin/gh/malfet/498/base 2025-09-07T07:51:36.3171359Z * [new branch] gh/malfet/498/head -> origin/gh/malfet/498/head 2025-09-07T07:51:36.3172935Z * [new branch] gh/malfet/498/orig -> origin/gh/malfet/498/orig 2025-09-07T07:51:36.3175388Z * [new branch] gh/malfet/499/base -> origin/gh/malfet/499/base 2025-09-07T07:51:36.3177054Z * [new branch] gh/malfet/499/head -> origin/gh/malfet/499/head 2025-09-07T07:51:36.3178572Z * [new branch] gh/malfet/499/orig -> origin/gh/malfet/499/orig 2025-09-07T07:51:36.3180887Z * [new branch] gh/malfet/500/base -> origin/gh/malfet/500/base 2025-09-07T07:51:36.3182636Z * [new branch] gh/malfet/500/head -> origin/gh/malfet/500/head 2025-09-07T07:51:36.3184222Z * [new branch] gh/malfet/500/orig -> origin/gh/malfet/500/orig 2025-09-07T07:51:36.3187148Z * [new branch] gh/malfet/501/base -> origin/gh/malfet/501/base 2025-09-07T07:51:36.3188719Z * [new branch] gh/malfet/501/head -> origin/gh/malfet/501/head 2025-09-07T07:51:36.3190265Z * [new branch] gh/malfet/501/orig -> origin/gh/malfet/501/orig 2025-09-07T07:51:36.3192528Z * [new branch] gh/malfet/502/base -> origin/gh/malfet/502/base 2025-09-07T07:51:36.3194167Z * [new branch] gh/malfet/502/head -> origin/gh/malfet/502/head 2025-09-07T07:51:36.3196200Z * [new branch] gh/malfet/502/orig -> origin/gh/malfet/502/orig 2025-09-07T07:51:36.3198470Z * [new branch] gh/malfet/503/base -> origin/gh/malfet/503/base 2025-09-07T07:51:36.3200027Z * [new branch] gh/malfet/503/head -> 
origin/gh/malfet/503/head 2025-09-07T07:51:36.3201617Z * [new branch] gh/malfet/503/orig -> origin/gh/malfet/503/orig 2025-09-07T07:51:36.3203876Z * [new branch] gh/malfet/504/base -> origin/gh/malfet/504/base 2025-09-07T07:51:36.3205738Z * [new branch] gh/malfet/504/head -> origin/gh/malfet/504/head 2025-09-07T07:51:36.3207372Z * [new branch] gh/malfet/504/orig -> origin/gh/malfet/504/orig 2025-09-07T07:51:36.3209763Z * [new branch] gh/malfet/505/base -> origin/gh/malfet/505/base 2025-09-07T07:51:36.3211297Z * [new branch] gh/malfet/505/head -> origin/gh/malfet/505/head 2025-09-07T07:51:36.3212907Z * [new branch] gh/malfet/505/orig -> origin/gh/malfet/505/orig 2025-09-07T07:51:36.3215444Z * [new branch] gh/malfet/506/base -> origin/gh/malfet/506/base 2025-09-07T07:51:36.3217090Z * [new branch] gh/malfet/506/head -> origin/gh/malfet/506/head 2025-09-07T07:51:36.3218702Z * [new branch] gh/malfet/506/orig -> origin/gh/malfet/506/orig 2025-09-07T07:51:36.3220985Z * [new branch] gh/malfet/507/base -> origin/gh/malfet/507/base 2025-09-07T07:51:36.3222749Z * [new branch] gh/malfet/507/head -> origin/gh/malfet/507/head 2025-09-07T07:51:36.3224294Z * [new branch] gh/malfet/507/orig -> origin/gh/malfet/507/orig 2025-09-07T07:51:36.3227027Z * [new branch] gh/malfet/508/base -> origin/gh/malfet/508/base 2025-09-07T07:51:36.3228644Z * [new branch] gh/malfet/508/head -> origin/gh/malfet/508/head 2025-09-07T07:51:36.3230261Z * [new branch] gh/malfet/508/orig -> origin/gh/malfet/508/orig 2025-09-07T07:51:36.3232468Z * [new branch] gh/malfet/509/base -> origin/gh/malfet/509/base 2025-09-07T07:51:36.3234132Z * [new branch] gh/malfet/509/head -> origin/gh/malfet/509/head 2025-09-07T07:51:36.3236022Z * [new branch] gh/malfet/509/orig -> origin/gh/malfet/509/orig 2025-09-07T07:51:36.3238442Z * [new branch] gh/malfet/510/base -> origin/gh/malfet/510/base 2025-09-07T07:51:36.3239977Z * [new branch] gh/malfet/510/head -> origin/gh/malfet/510/head 2025-09-07T07:51:36.3241494Z * [new 
branch] gh/malfet/510/orig -> origin/gh/malfet/510/orig 2025-09-07T07:51:36.3243774Z * [new branch] gh/malfet/511/base -> origin/gh/malfet/511/base 2025-09-07T07:51:36.3245645Z * [new branch] gh/malfet/511/head -> origin/gh/malfet/511/head 2025-09-07T07:51:36.3247288Z * [new branch] gh/malfet/511/orig -> origin/gh/malfet/511/orig 2025-09-07T07:51:36.3249612Z * [new branch] gh/malfet/512/base -> origin/gh/malfet/512/base 2025-09-07T07:51:36.3251188Z * [new branch] gh/malfet/512/head -> origin/gh/malfet/512/head 2025-09-07T07:51:36.3252878Z * [new branch] gh/malfet/512/orig -> origin/gh/malfet/512/orig 2025-09-07T07:51:36.3255580Z * [new branch] gh/malfet/513/base -> origin/gh/malfet/513/base 2025-09-07T07:51:36.3257027Z * [new branch] gh/malfet/513/head -> origin/gh/malfet/513/head 2025-09-07T07:51:36.3258570Z * [new branch] gh/malfet/513/orig -> origin/gh/malfet/513/orig 2025-09-07T07:51:36.3260915Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-09-07T07:51:36.3262658Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-09-07T07:51:36.3265767Z * [new branch] gh/manuelcandales/10/base -> origin/gh/manuelcandales/10/base 2025-09-07T07:51:36.3267396Z * [new branch] gh/manuelcandales/10/head -> origin/gh/manuelcandales/10/head 2025-09-07T07:51:36.3268991Z * [new branch] gh/manuelcandales/10/orig -> origin/gh/manuelcandales/10/orig 2025-09-07T07:51:36.3271257Z * [new branch] gh/manuelcandales/11/base -> origin/gh/manuelcandales/11/base 2025-09-07T07:51:36.3272843Z * [new branch] gh/manuelcandales/11/head -> origin/gh/manuelcandales/11/head 2025-09-07T07:51:36.3274409Z * [new branch] gh/manuelcandales/11/orig -> origin/gh/manuelcandales/11/orig 2025-09-07T07:51:36.3276976Z * [new branch] gh/manuelcandales/9/base -> origin/gh/manuelcandales/9/base 2025-09-07T07:51:36.3278721Z * [new branch] gh/manuelcandales/9/head -> origin/gh/manuelcandales/9/head 2025-09-07T07:51:36.3280166Z * [new branch] gh/manuelcandales/9/orig -> 
origin/gh/manuelcandales/9/orig 2025-09-07T07:51:36.3283245Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-09-07T07:51:36.3287269Z * [new branch] gh/masnesral/204/base -> origin/gh/masnesral/204/base 2025-09-07T07:51:36.3288997Z * [new branch] gh/masnesral/204/head -> origin/gh/masnesral/204/head 2025-09-07T07:51:36.3290683Z * [new branch] gh/masnesral/204/orig -> origin/gh/masnesral/204/orig 2025-09-07T07:51:36.3293077Z * [new branch] gh/masnesral/235/base -> origin/gh/masnesral/235/base 2025-09-07T07:51:36.3294690Z * [new branch] gh/masnesral/235/head -> origin/gh/masnesral/235/head 2025-09-07T07:51:36.3297132Z * [new branch] gh/masnesral/235/orig -> origin/gh/masnesral/235/orig 2025-09-07T07:51:36.3298987Z * [new branch] gh/masnesral/34/base -> origin/gh/masnesral/34/base 2025-09-07T07:51:36.3301957Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-09-07T07:51:36.3303734Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-09-07T07:51:36.3306012Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-09-07T07:51:36.3307885Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 2025-09-07T07:51:36.3310056Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-09-07T07:51:36.3311550Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-09-07T07:51:36.3313651Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-09-07T07:51:36.3332949Z * [new branch] gh/mhorowitz/3/head -> origin/gh/mhorowitz/3/head 2025-09-07T07:51:36.3333230Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 2025-09-07T07:51:36.3333415Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-09-07T07:51:36.3333583Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-09-07T07:51:36.3333745Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-09-07T07:51:36.3333899Z * [new branch] gh/mhorowitz/6/base -> 
origin/gh/mhorowitz/6/base 2025-09-07T07:51:36.3334077Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-09-07T07:51:36.3334294Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-09-07T07:51:36.3334500Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-09-07T07:51:36.3334702Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 2025-09-07T07:51:36.3335366Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-09-07T07:51:36.3337828Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-09-07T07:51:36.3339303Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-09-07T07:51:36.3341606Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-09-07T07:51:36.3343229Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-09-07T07:51:36.3345715Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-09-07T07:51:36.3347384Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-09-07T07:51:36.3349720Z * [new branch] gh/mikaylagawarecki/317/base -> origin/gh/mikaylagawarecki/317/base 2025-09-07T07:51:36.3351348Z * [new branch] gh/mikaylagawarecki/317/head -> origin/gh/mikaylagawarecki/317/head 2025-09-07T07:51:36.3352978Z * [new branch] gh/mikaylagawarecki/317/orig -> origin/gh/mikaylagawarecki/317/orig 2025-09-07T07:51:36.3355453Z * [new branch] gh/mikaylagawarecki/320/base -> origin/gh/mikaylagawarecki/320/base 2025-09-07T07:51:36.3357184Z * [new branch] gh/mikaylagawarecki/320/head -> origin/gh/mikaylagawarecki/320/head 2025-09-07T07:51:36.3358801Z * [new branch] gh/mikaylagawarecki/320/orig -> origin/gh/mikaylagawarecki/320/orig 2025-09-07T07:51:36.3361065Z * [new branch] gh/mikaylagawarecki/329/base -> 
origin/gh/mikaylagawarecki/329/base 2025-09-07T07:51:36.3362597Z * [new branch] gh/mikaylagawarecki/329/head -> origin/gh/mikaylagawarecki/329/head 2025-09-07T07:51:36.3364209Z * [new branch] gh/mikaylagawarecki/329/orig -> origin/gh/mikaylagawarecki/329/orig 2025-09-07T07:51:36.3366819Z * [new branch] gh/mikaylagawarecki/330/base -> origin/gh/mikaylagawarecki/330/base 2025-09-07T07:51:36.3368350Z * [new branch] gh/mikaylagawarecki/330/head -> origin/gh/mikaylagawarecki/330/head 2025-09-07T07:51:36.3369989Z * [new branch] gh/mikaylagawarecki/330/orig -> origin/gh/mikaylagawarecki/330/orig 2025-09-07T07:51:36.3372265Z * [new branch] gh/mikaylagawarecki/331/base -> origin/gh/mikaylagawarecki/331/base 2025-09-07T07:51:36.3374019Z * [new branch] gh/mikaylagawarecki/331/head -> origin/gh/mikaylagawarecki/331/head 2025-09-07T07:51:36.3375714Z * [new branch] gh/mikaylagawarecki/331/orig -> origin/gh/mikaylagawarecki/331/orig 2025-09-07T07:51:36.3378193Z * [new branch] gh/mikaylagawarecki/332/base -> origin/gh/mikaylagawarecki/332/base 2025-09-07T07:51:36.3379780Z * [new branch] gh/mikaylagawarecki/332/head -> origin/gh/mikaylagawarecki/332/head 2025-09-07T07:51:36.3381339Z * [new branch] gh/mikaylagawarecki/332/orig -> origin/gh/mikaylagawarecki/332/orig 2025-09-07T07:51:36.3383978Z * [new branch] gh/mikaylagawarecki/334/base -> origin/gh/mikaylagawarecki/334/base 2025-09-07T07:51:36.3385854Z * [new branch] gh/mikaylagawarecki/334/head -> origin/gh/mikaylagawarecki/334/head 2025-09-07T07:51:36.3387475Z * [new branch] gh/mikaylagawarecki/334/orig -> origin/gh/mikaylagawarecki/334/orig 2025-09-07T07:51:36.3389786Z * [new branch] gh/mikaylagawarecki/335/base -> origin/gh/mikaylagawarecki/335/base 2025-09-07T07:51:36.3391423Z * [new branch] gh/mikaylagawarecki/335/head -> origin/gh/mikaylagawarecki/335/head 2025-09-07T07:51:36.3393027Z * [new branch] gh/mikaylagawarecki/335/orig -> origin/gh/mikaylagawarecki/335/orig 2025-09-07T07:51:36.3395495Z * [new branch] 
gh/mikaylagawarecki/336/base -> origin/gh/mikaylagawarecki/336/base 2025-09-07T07:51:36.3397220Z * [new branch] gh/mikaylagawarecki/336/head -> origin/gh/mikaylagawarecki/336/head 2025-09-07T07:51:36.3398756Z * [new branch] gh/mikaylagawarecki/336/orig -> origin/gh/mikaylagawarecki/336/orig 2025-09-07T07:51:36.3400901Z * [new branch] gh/mikaylagawarecki/337/base -> origin/gh/mikaylagawarecki/337/base 2025-09-07T07:51:36.3402471Z * [new branch] gh/mikaylagawarecki/337/head -> origin/gh/mikaylagawarecki/337/head 2025-09-07T07:51:36.3404100Z * [new branch] gh/mikaylagawarecki/337/orig -> origin/gh/mikaylagawarecki/337/orig 2025-09-07T07:51:36.3406666Z * [new branch] gh/mikaylagawarecki/338/base -> origin/gh/mikaylagawarecki/338/base 2025-09-07T07:51:36.3408347Z * [new branch] gh/mikaylagawarecki/338/head -> origin/gh/mikaylagawarecki/338/head 2025-09-07T07:51:36.3409937Z * [new branch] gh/mikaylagawarecki/338/orig -> origin/gh/mikaylagawarecki/338/orig 2025-09-07T07:51:36.3412098Z * [new branch] gh/mikaylagawarecki/339/base -> origin/gh/mikaylagawarecki/339/base 2025-09-07T07:51:36.3413762Z * [new branch] gh/mikaylagawarecki/339/head -> origin/gh/mikaylagawarecki/339/head 2025-09-07T07:51:36.3415398Z * [new branch] gh/mikaylagawarecki/339/orig -> origin/gh/mikaylagawarecki/339/orig 2025-09-07T07:51:36.3418497Z * [new branch] gh/mlazos/1/base -> origin/gh/mlazos/1/base 2025-09-07T07:51:36.3420128Z * [new branch] gh/mlazos/1/head -> origin/gh/mlazos/1/head 2025-09-07T07:51:36.3421849Z * [new branch] gh/mlazos/1/orig -> origin/gh/mlazos/1/orig 2025-09-07T07:51:36.3424170Z * [new branch] gh/mlazos/12/base -> origin/gh/mlazos/12/base 2025-09-07T07:51:36.3426086Z * [new branch] gh/mlazos/12/head -> origin/gh/mlazos/12/head 2025-09-07T07:51:36.3428106Z * [new branch] gh/mlazos/12/orig -> origin/gh/mlazos/12/orig 2025-09-07T07:51:36.3430417Z * [new branch] gh/mlazos/13/base -> origin/gh/mlazos/13/base 2025-09-07T07:51:36.3431990Z * [new branch] gh/mlazos/13/head -> 
origin/gh/mlazos/13/head 2025-09-07T07:51:36.3433541Z * [new branch] gh/mlazos/13/orig -> origin/gh/mlazos/13/orig 2025-09-07T07:51:36.3436175Z * [new branch] gh/mlazos/14/base -> origin/gh/mlazos/14/base 2025-09-07T07:51:36.3437754Z * [new branch] gh/mlazos/14/head -> origin/gh/mlazos/14/head 2025-09-07T07:51:36.3439489Z * [new branch] gh/mlazos/14/orig -> origin/gh/mlazos/14/orig 2025-09-07T07:51:36.3441705Z * [new branch] gh/mlazos/15/base -> origin/gh/mlazos/15/base 2025-09-07T07:51:36.3443269Z * [new branch] gh/mlazos/15/head -> origin/gh/mlazos/15/head 2025-09-07T07:51:36.3444815Z * [new branch] gh/mlazos/15/orig -> origin/gh/mlazos/15/orig 2025-09-07T07:51:36.3447452Z * [new branch] gh/mlazos/16/base -> origin/gh/mlazos/16/base 2025-09-07T07:51:36.3449147Z * [new branch] gh/mlazos/16/head -> origin/gh/mlazos/16/head 2025-09-07T07:51:36.3450667Z * [new branch] gh/mlazos/16/orig -> origin/gh/mlazos/16/orig 2025-09-07T07:51:36.3452842Z * [new branch] gh/mlazos/17/base -> origin/gh/mlazos/17/base 2025-09-07T07:51:36.3454425Z * [new branch] gh/mlazos/17/head -> origin/gh/mlazos/17/head 2025-09-07T07:51:36.3456270Z * [new branch] gh/mlazos/17/orig -> origin/gh/mlazos/17/orig 2025-09-07T07:51:36.3458607Z * [new branch] gh/mlazos/2/base -> origin/gh/mlazos/2/base 2025-09-07T07:51:36.3460210Z * [new branch] gh/mlazos/2/head -> origin/gh/mlazos/2/head 2025-09-07T07:51:36.3461819Z * [new branch] gh/mlazos/2/orig -> origin/gh/mlazos/2/orig 2025-09-07T07:51:36.3464220Z * [new branch] gh/mlazos/3/base -> origin/gh/mlazos/3/base 2025-09-07T07:51:36.3465943Z * [new branch] gh/mlazos/3/head -> origin/gh/mlazos/3/head 2025-09-07T07:51:36.3467536Z * [new branch] gh/mlazos/3/orig -> origin/gh/mlazos/3/orig 2025-09-07T07:51:36.3470565Z * [new branch] gh/mrmiywj/1/base -> origin/gh/mrmiywj/1/base 2025-09-07T07:51:36.3472266Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-09-07T07:51:36.3475294Z * [new branch] gh/muchulee8/62/base -> origin/gh/muchulee8/62/base 
2025-09-07T07:51:36.3477249Z * [new branch] gh/muchulee8/62/head -> origin/gh/muchulee8/62/head 2025-09-07T07:51:36.3478833Z * [new branch] gh/muchulee8/62/orig -> origin/gh/muchulee8/62/orig 2025-09-07T07:51:36.3481263Z * [new branch] gh/muchulee8/63/base -> origin/gh/muchulee8/63/base 2025-09-07T07:51:36.3482837Z * [new branch] gh/muchulee8/63/head -> origin/gh/muchulee8/63/head 2025-09-07T07:51:36.3484458Z * [new branch] gh/muchulee8/63/orig -> origin/gh/muchulee8/63/orig 2025-09-07T07:51:36.3487216Z * [new branch] gh/muchulee8/64/base -> origin/gh/muchulee8/64/base 2025-09-07T07:51:36.3488791Z * [new branch] gh/muchulee8/64/head -> origin/gh/muchulee8/64/head 2025-09-07T07:51:36.3490404Z * [new branch] gh/muchulee8/64/orig -> origin/gh/muchulee8/64/orig 2025-09-07T07:51:36.3492850Z * [new branch] gh/muchulee8/65/base -> origin/gh/muchulee8/65/base 2025-09-07T07:51:36.3494551Z * [new branch] gh/muchulee8/65/head -> origin/gh/muchulee8/65/head 2025-09-07T07:51:36.3496611Z * [new branch] gh/muchulee8/65/orig -> origin/gh/muchulee8/65/orig 2025-09-07T07:51:36.3499478Z * [new branch] gh/naveenthangudu/1/base -> origin/gh/naveenthangudu/1/base 2025-09-07T07:51:36.3501092Z * [new branch] gh/naveenthangudu/1/head -> origin/gh/naveenthangudu/1/head 2025-09-07T07:51:36.3502865Z * [new branch] gh/naveenthangudu/1/orig -> origin/gh/naveenthangudu/1/orig 2025-09-07T07:51:36.3505262Z * [new branch] gh/naveenthangudu/2/base -> origin/gh/naveenthangudu/2/base 2025-09-07T07:51:36.3506941Z * [new branch] gh/naveenthangudu/2/head -> origin/gh/naveenthangudu/2/head 2025-09-07T07:51:36.3508769Z * [new branch] gh/naveenthangudu/2/orig -> origin/gh/naveenthangudu/2/orig 2025-09-07T07:51:36.3510900Z * [new branch] gh/naveenthangudu/3/base -> origin/gh/naveenthangudu/3/base 2025-09-07T07:51:36.3512443Z * [new branch] gh/naveenthangudu/3/head -> origin/gh/naveenthangudu/3/head 2025-09-07T07:51:36.3514103Z * [new branch] gh/naveenthangudu/3/orig -> origin/gh/naveenthangudu/3/orig 
2025-09-07T07:51:36.3516591Z * [new branch] gh/naveenthangudu/4/base -> origin/gh/naveenthangudu/4/base 2025-09-07T07:51:36.3518190Z * [new branch] gh/naveenthangudu/4/head -> origin/gh/naveenthangudu/4/head 2025-09-07T07:51:36.3519805Z * [new branch] gh/naveenthangudu/4/orig -> origin/gh/naveenthangudu/4/orig 2025-09-07T07:51:36.3522087Z * [new branch] gh/naveenthangudu/5/base -> origin/gh/naveenthangudu/5/base 2025-09-07T07:51:36.3523688Z * [new branch] gh/naveenthangudu/5/head -> origin/gh/naveenthangudu/5/head 2025-09-07T07:51:36.3525517Z * [new branch] gh/naveenthangudu/5/orig -> origin/gh/naveenthangudu/5/orig 2025-09-07T07:51:36.3527935Z * [new branch] gh/naveenthangudu/6/base -> origin/gh/naveenthangudu/6/base 2025-09-07T07:51:36.3529487Z * [new branch] gh/naveenthangudu/6/head -> origin/gh/naveenthangudu/6/head 2025-09-07T07:51:36.3531075Z * [new branch] gh/naveenthangudu/6/orig -> origin/gh/naveenthangudu/6/orig 2025-09-07T07:51:36.3533840Z * [new branch] gh/oulgen/35/base -> origin/gh/oulgen/35/base 2025-09-07T07:51:36.3535648Z * [new branch] gh/oulgen/35/head -> origin/gh/oulgen/35/head 2025-09-07T07:51:36.3537223Z * [new branch] gh/oulgen/35/orig -> origin/gh/oulgen/35/orig 2025-09-07T07:51:36.3539319Z * [new branch] gh/oulgen/48/base -> origin/gh/oulgen/48/base 2025-09-07T07:51:36.3540863Z * [new branch] gh/oulgen/48/head -> origin/gh/oulgen/48/head 2025-09-07T07:51:36.3542520Z * [new branch] gh/oulgen/48/orig -> origin/gh/oulgen/48/orig 2025-09-07T07:51:36.3544600Z * [new branch] gh/oulgen/49/base -> origin/gh/oulgen/49/base 2025-09-07T07:51:36.3546425Z * [new branch] gh/oulgen/49/head -> origin/gh/oulgen/49/head 2025-09-07T07:51:36.3548100Z * [new branch] gh/oulgen/49/orig -> origin/gh/oulgen/49/orig 2025-09-07T07:51:36.3550966Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-09-07T07:51:36.3552559Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 2025-09-07T07:51:36.3554197Z * [new branch] gh/pearu/108/orig -> 
origin/gh/pearu/108/orig 2025-09-07T07:51:36.3556722Z * [new branch] gh/pearu/109/base -> origin/gh/pearu/109/base 2025-09-07T07:51:36.3558275Z * [new branch] gh/pearu/109/head -> origin/gh/pearu/109/head 2025-09-07T07:51:36.3559767Z * [new branch] gh/pearu/109/orig -> origin/gh/pearu/109/orig 2025-09-07T07:51:36.3561888Z * [new branch] gh/pearu/110/base -> origin/gh/pearu/110/base 2025-09-07T07:51:36.3563457Z * [new branch] gh/pearu/110/head -> origin/gh/pearu/110/head 2025-09-07T07:51:36.3565211Z * [new branch] gh/pearu/110/orig -> origin/gh/pearu/110/orig 2025-09-07T07:51:36.3567862Z * [new branch] gh/pearu/111/base -> origin/gh/pearu/111/base 2025-09-07T07:51:36.3569387Z * [new branch] gh/pearu/111/head -> origin/gh/pearu/111/head 2025-09-07T07:51:36.3570902Z * [new branch] gh/pearu/111/orig -> origin/gh/pearu/111/orig 2025-09-07T07:51:36.3573005Z * [new branch] gh/pearu/112/base -> origin/gh/pearu/112/base 2025-09-07T07:51:36.3574841Z * [new branch] gh/pearu/112/head -> origin/gh/pearu/112/head 2025-09-07T07:51:36.3576599Z * [new branch] gh/pearu/112/orig -> origin/gh/pearu/112/orig 2025-09-07T07:51:36.3578743Z * [new branch] gh/pearu/113/base -> origin/gh/pearu/113/base 2025-09-07T07:51:36.3580340Z * [new branch] gh/pearu/113/head -> origin/gh/pearu/113/head 2025-09-07T07:51:36.3582040Z * [new branch] gh/pearu/113/orig -> origin/gh/pearu/113/orig 2025-09-07T07:51:36.3584268Z * [new branch] gh/pearu/114/base -> origin/gh/pearu/114/base 2025-09-07T07:51:36.3586092Z * [new branch] gh/pearu/114/head -> origin/gh/pearu/114/head 2025-09-07T07:51:36.3587695Z * [new branch] gh/pearu/114/orig -> origin/gh/pearu/114/orig 2025-09-07T07:51:36.3589903Z * [new branch] gh/pearu/115/base -> origin/gh/pearu/115/base 2025-09-07T07:51:36.3591505Z * [new branch] gh/pearu/115/head -> origin/gh/pearu/115/head 2025-09-07T07:51:36.3593002Z * [new branch] gh/pearu/115/orig -> origin/gh/pearu/115/orig 2025-09-07T07:51:36.3595302Z * [new branch] gh/pearu/116/base -> 
origin/gh/pearu/116/base 2025-09-07T07:51:36.3596979Z * [new branch] gh/pearu/116/head -> origin/gh/pearu/116/head 2025-09-07T07:51:36.3598496Z * [new branch] gh/pearu/116/orig -> origin/gh/pearu/116/orig 2025-09-07T07:51:36.3600678Z * [new branch] gh/pearu/117/base -> origin/gh/pearu/117/base 2025-09-07T07:51:36.3602216Z * [new branch] gh/pearu/117/head -> origin/gh/pearu/117/head 2025-09-07T07:51:36.3603766Z * [new branch] gh/pearu/117/orig -> origin/gh/pearu/117/orig 2025-09-07T07:51:36.3606678Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-09-07T07:51:36.3608410Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-09-07T07:51:36.3609881Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-09-07T07:51:36.3612229Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-09-07T07:51:36.3613946Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-09-07T07:51:36.3615664Z * [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-09-07T07:51:36.3618439Z * [new branch] gh/qqaatw/29/base -> origin/gh/qqaatw/29/base 2025-09-07T07:51:36.3619979Z * [new branch] gh/qqaatw/29/head -> origin/gh/qqaatw/29/head 2025-09-07T07:51:36.3621671Z * [new branch] gh/qqaatw/29/orig -> origin/gh/qqaatw/29/orig 2025-09-07T07:51:36.3624012Z * [new branch] gh/raymo/refresh-script -> origin/gh/raymo/refresh-script 2025-09-07T07:51:36.3627060Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-09-07T07:51:36.3628569Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-09-07T07:51:36.3630722Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-09-07T07:51:36.3632272Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-09-07T07:51:36.3633875Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-09-07T07:51:36.3636282Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-09-07T07:51:36.3637792Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 
2025-09-07T07:51:36.3639329Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-09-07T07:51:36.3641526Z * [new branch] gh/rec/156/base -> origin/gh/rec/156/base 2025-09-07T07:51:36.3643264Z * [new branch] gh/rec/156/head -> origin/gh/rec/156/head 2025-09-07T07:51:36.3644626Z * [new branch] gh/rec/156/orig -> origin/gh/rec/156/orig 2025-09-07T07:51:36.3647122Z * [new branch] gh/rec/160/base -> origin/gh/rec/160/base 2025-09-07T07:51:36.3648718Z * [new branch] gh/rec/160/head -> origin/gh/rec/160/head 2025-09-07T07:51:36.3650140Z * [new branch] gh/rec/160/orig -> origin/gh/rec/160/orig 2025-09-07T07:51:36.3652296Z * [new branch] gh/rec/162/base -> origin/gh/rec/162/base 2025-09-07T07:51:36.3653937Z * [new branch] gh/rec/162/head -> origin/gh/rec/162/head 2025-09-07T07:51:36.3655729Z * [new branch] gh/rec/162/orig -> origin/gh/rec/162/orig 2025-09-07T07:51:36.3657972Z * [new branch] gh/rec/163/base -> origin/gh/rec/163/base 2025-09-07T07:51:36.3659480Z * [new branch] gh/rec/163/head -> origin/gh/rec/163/head 2025-09-07T07:51:36.3660989Z * [new branch] gh/rec/163/orig -> origin/gh/rec/163/orig 2025-09-07T07:51:36.3663355Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-09-07T07:51:36.3664870Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-09-07T07:51:36.3666845Z * [new branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-09-07T07:51:36.3669088Z * [new branch] gh/rec/165/base -> origin/gh/rec/165/base 2025-09-07T07:51:36.3670836Z * [new branch] gh/rec/165/head -> origin/gh/rec/165/head 2025-09-07T07:51:36.3672339Z * [new branch] gh/rec/165/orig -> origin/gh/rec/165/orig 2025-09-07T07:51:36.3674466Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-09-07T07:51:36.3676397Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-09-07T07:51:36.3677884Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-09-07T07:51:36.3680753Z * [new branch] gh/robert-hardwick/1/base -> origin/gh/robert-hardwick/1/base 
2025-09-07T07:51:36.3682398Z * [new branch] gh/robert-hardwick/1/head -> origin/gh/robert-hardwick/1/head 2025-09-07T07:51:36.3683973Z * [new branch] gh/robert-hardwick/1/orig -> origin/gh/robert-hardwick/1/orig 2025-09-07T07:51:36.3686521Z * [new branch] gh/robert-hardwick/2/base -> origin/gh/robert-hardwick/2/base 2025-09-07T07:51:36.3688146Z * [new branch] gh/robert-hardwick/2/head -> origin/gh/robert-hardwick/2/head 2025-09-07T07:51:36.3689604Z * [new branch] gh/robert-hardwick/2/orig -> origin/gh/robert-hardwick/2/orig 2025-09-07T07:51:36.3691783Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-09-07T07:51:36.3693546Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-09-07T07:51:36.3695076Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-09-07T07:51:36.3697507Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-09-07T07:51:36.3699096Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-09-07T07:51:36.3700554Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-09-07T07:51:36.3703446Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-09-07T07:51:36.3705130Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-09-07T07:51:36.3707433Z * [new branch] gh/rtimpe/10/base -> origin/gh/rtimpe/10/base 2025-09-07T07:51:36.3709008Z * [new branch] gh/rtimpe/10/head -> origin/gh/rtimpe/10/head 2025-09-07T07:51:36.3710672Z * [new branch] gh/rtimpe/10/orig -> origin/gh/rtimpe/10/orig 2025-09-07T07:51:36.3712829Z * [new branch] gh/rtimpe/11/base -> origin/gh/rtimpe/11/base 2025-09-07T07:51:36.3714432Z * [new branch] gh/rtimpe/11/head -> origin/gh/rtimpe/11/head 2025-09-07T07:51:36.3716276Z * [new branch] gh/rtimpe/11/orig -> origin/gh/rtimpe/11/orig 2025-09-07T07:51:36.3718504Z * [new branch] gh/rtimpe/12/base -> origin/gh/rtimpe/12/base 
2025-09-07T07:51:36.3719991Z * [new branch] gh/rtimpe/12/head -> origin/gh/rtimpe/12/head 2025-09-07T07:51:36.3721565Z * [new branch] gh/rtimpe/12/orig -> origin/gh/rtimpe/12/orig 2025-09-07T07:51:36.3723816Z * [new branch] gh/rtimpe/13/base -> origin/gh/rtimpe/13/base 2025-09-07T07:51:36.3725657Z * [new branch] gh/rtimpe/13/head -> origin/gh/rtimpe/13/head 2025-09-07T07:51:36.3727231Z * [new branch] gh/rtimpe/13/orig -> origin/gh/rtimpe/13/orig 2025-09-07T07:51:36.3729430Z * [new branch] gh/rtimpe/14/base -> origin/gh/rtimpe/14/base 2025-09-07T07:51:36.3731025Z * [new branch] gh/rtimpe/14/head -> origin/gh/rtimpe/14/head 2025-09-07T07:51:36.3732471Z * [new branch] gh/rtimpe/14/orig -> origin/gh/rtimpe/14/orig 2025-09-07T07:51:36.3734690Z * [new branch] gh/rtimpe/15/base -> origin/gh/rtimpe/15/base 2025-09-07T07:51:36.3736633Z * [new branch] gh/rtimpe/15/head -> origin/gh/rtimpe/15/head 2025-09-07T07:51:36.3738092Z * [new branch] gh/rtimpe/15/orig -> origin/gh/rtimpe/15/orig 2025-09-07T07:51:36.3740156Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-09-07T07:51:36.3741796Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-09-07T07:51:36.3743972Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-09-07T07:51:36.3745738Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-09-07T07:51:36.3747956Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-09-07T07:51:36.3749551Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-09-07T07:51:36.3751828Z * [new branch] gh/rtimpe/9/base -> origin/gh/rtimpe/9/base 2025-09-07T07:51:36.3753329Z * [new branch] gh/rtimpe/9/head -> origin/gh/rtimpe/9/head 2025-09-07T07:51:36.3754873Z * [new branch] gh/rtimpe/9/orig -> origin/gh/rtimpe/9/orig 2025-09-07T07:51:36.3758137Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-09-07T07:51:36.3759668Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 
2025-09-07T07:51:36.3761184Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-09-07T07:51:36.3763375Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-09-07T07:51:36.3765296Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-09-07T07:51:36.3767029Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-09-07T07:51:36.3769108Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 2025-09-07T07:51:36.3770736Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-09-07T07:51:36.3772260Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-09-07T07:51:36.3774449Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-09-07T07:51:36.3776446Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 2025-09-07T07:51:36.3777900Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-09-07T07:51:36.3780089Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-09-07T07:51:36.3781761Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-09-07T07:51:36.3783378Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-09-07T07:51:36.3785755Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-09-07T07:51:36.3787325Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-09-07T07:51:36.3788867Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-09-07T07:51:36.3791025Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-09-07T07:51:36.3792589Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-09-07T07:51:36.3794187Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-09-07T07:51:36.3797290Z * [new branch] gh/sarckk/2/base 
-> origin/gh/sarckk/2/base 2025-09-07T07:51:36.3798838Z * [new branch] gh/sarckk/2/head -> origin/gh/sarckk/2/head 2025-09-07T07:51:36.3800344Z * [new branch] gh/sarckk/2/orig -> origin/gh/sarckk/2/orig 2025-09-07T07:51:36.3803132Z * [new branch] gh/seemethere/35/base -> origin/gh/seemethere/35/base 2025-09-07T07:51:36.3804745Z * [new branch] gh/seemethere/35/head -> origin/gh/seemethere/35/head 2025-09-07T07:51:36.3806647Z * [new branch] gh/seemethere/35/orig -> origin/gh/seemethere/35/orig 2025-09-07T07:51:36.3808822Z * [new branch] gh/seemethere/37/base -> origin/gh/seemethere/37/base 2025-09-07T07:51:36.3810374Z * [new branch] gh/seemethere/37/head -> origin/gh/seemethere/37/head 2025-09-07T07:51:36.3811970Z * [new branch] gh/seemethere/37/orig -> origin/gh/seemethere/37/orig 2025-09-07T07:51:36.3814085Z * [new branch] gh/seemethere/43/base -> origin/gh/seemethere/43/base 2025-09-07T07:51:36.3815943Z * [new branch] gh/seemethere/43/head -> origin/gh/seemethere/43/head 2025-09-07T07:51:36.3817394Z * [new branch] gh/seemethere/43/orig -> origin/gh/seemethere/43/orig 2025-09-07T07:51:36.3819550Z * [new branch] gh/seemethere/44/base -> origin/gh/seemethere/44/base 2025-09-07T07:51:36.3821165Z * [new branch] gh/seemethere/44/head -> origin/gh/seemethere/44/head 2025-09-07T07:51:36.3822834Z * [new branch] gh/seemethere/44/orig -> origin/gh/seemethere/44/orig 2025-09-07T07:51:36.3825153Z * [new branch] gh/seemethere/48/base -> origin/gh/seemethere/48/base 2025-09-07T07:51:36.3826796Z * [new branch] gh/seemethere/48/head -> origin/gh/seemethere/48/head 2025-09-07T07:51:36.3828376Z * [new branch] gh/seemethere/48/orig -> origin/gh/seemethere/48/orig 2025-09-07T07:51:36.3830504Z * [new branch] gh/seemethere/49/base -> origin/gh/seemethere/49/base 2025-09-07T07:51:36.3832089Z * [new branch] gh/seemethere/49/head -> origin/gh/seemethere/49/head 2025-09-07T07:51:36.3833743Z * [new branch] gh/seemethere/49/orig -> origin/gh/seemethere/49/orig 2025-09-07T07:51:36.3836239Z * 
[new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-09-07T07:51:36.3837815Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-09-07T07:51:36.3839347Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-09-07T07:51:36.3841656Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-09-07T07:51:36.3843086Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-09-07T07:51:36.3844567Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-09-07T07:51:36.3847259Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-09-07T07:51:36.3848867Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-09-07T07:51:36.3850412Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-09-07T07:51:36.3852491Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-09-07T07:51:36.3853968Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-09-07T07:51:36.3855717Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-09-07T07:51:36.3858216Z * [new branch] gh/seemethere/56/base -> origin/gh/seemethere/56/base 2025-09-07T07:51:36.3859822Z * [new branch] gh/seemethere/56/head -> origin/gh/seemethere/56/head 2025-09-07T07:51:36.3861414Z * [new branch] gh/seemethere/56/orig -> origin/gh/seemethere/56/orig 2025-09-07T07:51:36.3863717Z * [new branch] gh/seemethere/57/base -> origin/gh/seemethere/57/base 2025-09-07T07:51:36.3865483Z * [new branch] gh/seemethere/57/head -> origin/gh/seemethere/57/head 2025-09-07T07:51:36.3867044Z * [new branch] gh/seemethere/57/orig -> origin/gh/seemethere/57/orig 2025-09-07T07:51:36.3869222Z * [new branch] gh/seemethere/58/base -> origin/gh/seemethere/58/base 2025-09-07T07:51:36.3870767Z * [new branch] gh/seemethere/58/head -> origin/gh/seemethere/58/head 2025-09-07T07:51:36.3872320Z * [new branch] gh/seemethere/58/orig -> 
origin/gh/seemethere/58/orig 2025-09-07T07:51:36.3874573Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-09-07T07:51:36.3876390Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-09-07T07:51:36.3877827Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-09-07T07:51:36.3879955Z * [new branch] gh/seemethere/60/base -> origin/gh/seemethere/60/base 2025-09-07T07:51:36.3881570Z * [new branch] gh/seemethere/60/head -> origin/gh/seemethere/60/head 2025-09-07T07:51:36.3883108Z * [new branch] gh/seemethere/60/orig -> origin/gh/seemethere/60/orig 2025-09-07T07:51:36.3885302Z * [new branch] gh/seemethere/61/base -> origin/gh/seemethere/61/base 2025-09-07T07:51:36.3887063Z * [new branch] gh/seemethere/61/head -> origin/gh/seemethere/61/head 2025-09-07T07:51:36.3888586Z * [new branch] gh/seemethere/61/orig -> origin/gh/seemethere/61/orig 2025-09-07T07:51:36.3890737Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-09-07T07:51:36.3892350Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-09-07T07:51:36.3893922Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-09-07T07:51:36.3896375Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-09-07T07:51:36.3897932Z * [new branch] gh/seemethere/63/head -> origin/gh/seemethere/63/head 2025-09-07T07:51:36.3899406Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-09-07T07:51:36.3902532Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-09-07T07:51:36.3904168Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-09-07T07:51:36.3906212Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-09-07T07:51:36.3908532Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-09-07T07:51:36.3910250Z * [new branch] gh/shunting314/176/head -> 
origin/gh/shunting314/176/head 2025-09-07T07:51:36.3911887Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-09-07T07:51:36.3914090Z * [new branch] gh/shunting314/211/base -> origin/gh/shunting314/211/base 2025-09-07T07:51:36.3915840Z * [new branch] gh/shunting314/211/head -> origin/gh/shunting314/211/head 2025-09-07T07:51:36.3917386Z * [new branch] gh/shunting314/211/orig -> origin/gh/shunting314/211/orig 2025-09-07T07:51:36.3919461Z * [new branch] gh/shunting314/212/base -> origin/gh/shunting314/212/base 2025-09-07T07:51:36.3920980Z * [new branch] gh/shunting314/212/head -> origin/gh/shunting314/212/head 2025-09-07T07:51:36.3922468Z * [new branch] gh/shunting314/212/orig -> origin/gh/shunting314/212/orig 2025-09-07T07:51:36.3924652Z * [new branch] gh/shunting314/213/base -> origin/gh/shunting314/213/base 2025-09-07T07:51:36.3926635Z * [new branch] gh/shunting314/213/head -> origin/gh/shunting314/213/head 2025-09-07T07:51:36.3928085Z * [new branch] gh/shunting314/213/orig -> origin/gh/shunting314/213/orig 2025-09-07T07:51:36.3930313Z * [new branch] gh/shunting314/214/base -> origin/gh/shunting314/214/base 2025-09-07T07:51:36.3931896Z * [new branch] gh/shunting314/214/head -> origin/gh/shunting314/214/head 2025-09-07T07:51:36.3933451Z * [new branch] gh/shunting314/214/orig -> origin/gh/shunting314/214/orig 2025-09-07T07:51:36.3936207Z * [new branch] gh/shunting314/215/base -> origin/gh/shunting314/215/base 2025-09-07T07:51:36.3937734Z * [new branch] gh/shunting314/215/head -> origin/gh/shunting314/215/head 2025-09-07T07:51:36.3939174Z * [new branch] gh/shunting314/215/orig -> origin/gh/shunting314/215/orig 2025-09-07T07:51:36.3941286Z * [new branch] gh/shunting314/216/base -> origin/gh/shunting314/216/base 2025-09-07T07:51:36.3942964Z * [new branch] gh/shunting314/216/head -> origin/gh/shunting314/216/head 2025-09-07T07:51:36.3944513Z * [new branch] gh/shunting314/216/orig -> origin/gh/shunting314/216/orig 2025-09-07T07:51:36.3946999Z * 
[new branch] gh/shunting314/217/base -> origin/gh/shunting314/217/base 2025-09-07T07:51:36.3948588Z * [new branch] gh/shunting314/217/head -> origin/gh/shunting314/217/head 2025-09-07T07:51:36.3950255Z * [new branch] gh/shunting314/217/orig -> origin/gh/shunting314/217/orig 2025-09-07T07:51:36.3952559Z * [new branch] gh/shunting314/218/base -> origin/gh/shunting314/218/base 2025-09-07T07:51:36.3954101Z * [new branch] gh/shunting314/218/head -> origin/gh/shunting314/218/head 2025-09-07T07:51:36.3955861Z * [new branch] gh/shunting314/218/orig -> origin/gh/shunting314/218/orig 2025-09-07T07:51:36.3957986Z * [new branch] gh/shunting314/219/base -> origin/gh/shunting314/219/base 2025-09-07T07:51:36.3959530Z * [new branch] gh/shunting314/219/head -> origin/gh/shunting314/219/head 2025-09-07T07:51:36.3961055Z * [new branch] gh/shunting314/219/orig -> origin/gh/shunting314/219/orig 2025-09-07T07:51:36.3963301Z * [new branch] gh/shunting314/220/base -> origin/gh/shunting314/220/base 2025-09-07T07:51:36.3965142Z * [new branch] gh/shunting314/220/head -> origin/gh/shunting314/220/head 2025-09-07T07:51:36.3966856Z * [new branch] gh/shunting314/220/orig -> origin/gh/shunting314/220/orig 2025-09-07T07:51:36.3969250Z * [new branch] gh/shunting314/221/base -> origin/gh/shunting314/221/base 2025-09-07T07:51:36.3970582Z * [new branch] gh/shunting314/221/head -> origin/gh/shunting314/221/head 2025-09-07T07:51:36.3972054Z * [new branch] gh/shunting314/221/orig -> origin/gh/shunting314/221/orig 2025-09-07T07:51:36.3974113Z * [new branch] gh/shunting314/222/base -> origin/gh/shunting314/222/base 2025-09-07T07:51:36.3976036Z * [new branch] gh/shunting314/222/head -> origin/gh/shunting314/222/head 2025-09-07T07:51:36.3977541Z * [new branch] gh/shunting314/222/orig -> origin/gh/shunting314/222/orig 2025-09-07T07:51:36.3979613Z * [new branch] gh/shunting314/223/base -> origin/gh/shunting314/223/base 2025-09-07T07:51:36.3981170Z * [new branch] gh/shunting314/223/head -> 
origin/gh/shunting314/223/head 2025-09-07T07:51:36.3982847Z * [new branch] gh/shunting314/223/orig -> origin/gh/shunting314/223/orig 2025-09-07T07:51:36.3986023Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-09-07T07:51:36.3987656Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-09-07T07:51:36.3989815Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 2025-09-07T07:51:36.3991397Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-09-07T07:51:36.3993432Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-09-07T07:51:36.3995078Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-09-07T07:51:36.3997394Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-09-07T07:51:36.3999000Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-09-07T07:51:36.4001799Z * [new branch] gh/sinhaanhsul/1/base -> origin/gh/sinhaanhsul/1/base 2025-09-07T07:51:36.4003458Z * [new branch] gh/sinhaanhsul/1/head -> origin/gh/sinhaanhsul/1/head 2025-09-07T07:51:36.4006624Z * [new branch] gh/skarjala/17/base -> origin/gh/skarjala/17/base 2025-09-07T07:51:36.4008205Z * [new branch] gh/skarjala/17/head -> origin/gh/skarjala/17/head 2025-09-07T07:51:36.4009855Z * [new branch] gh/skarjala/17/orig -> origin/gh/skarjala/17/orig 2025-09-07T07:51:36.4012131Z * [new branch] gh/skarjala/18/base -> origin/gh/skarjala/18/base 2025-09-07T07:51:36.4013712Z * [new branch] gh/skarjala/18/head -> origin/gh/skarjala/18/head 2025-09-07T07:51:36.4015497Z * [new branch] gh/skarjala/18/orig -> origin/gh/skarjala/18/orig 2025-09-07T07:51:36.4017803Z * [new branch] gh/skarjala/19/base -> origin/gh/skarjala/19/base 2025-09-07T07:51:36.4019456Z * [new branch] gh/skarjala/19/head -> origin/gh/skarjala/19/head 2025-09-07T07:51:36.4021085Z * [new branch] gh/skarjala/19/orig -> origin/gh/skarjala/19/orig 2025-09-07T07:51:36.4024068Z * [new branch] gh/slayton58/1/base -> 
origin/gh/slayton58/1/base 2025-09-07T07:51:36.4026026Z * [new branch] gh/slayton58/1/head -> origin/gh/slayton58/1/head 2025-09-07T07:51:36.4027608Z * [new branch] gh/slayton58/1/orig -> origin/gh/slayton58/1/orig 2025-09-07T07:51:36.4029856Z * [new branch] gh/slayton58/2/base -> origin/gh/slayton58/2/base 2025-09-07T07:51:36.4031479Z * [new branch] gh/slayton58/2/head -> origin/gh/slayton58/2/head 2025-09-07T07:51:36.4033056Z * [new branch] gh/slayton58/2/orig -> origin/gh/slayton58/2/orig 2025-09-07T07:51:36.4035337Z * [new branch] gh/slayton58/3/base -> origin/gh/slayton58/3/base 2025-09-07T07:51:36.4037382Z * [new branch] gh/slayton58/3/head -> origin/gh/slayton58/3/head 2025-09-07T07:51:36.4038892Z * [new branch] gh/slayton58/3/orig -> origin/gh/slayton58/3/orig 2025-09-07T07:51:36.4041036Z * [new branch] gh/slayton58/4/base -> origin/gh/slayton58/4/base 2025-09-07T07:51:36.4042640Z * [new branch] gh/slayton58/4/head -> origin/gh/slayton58/4/head 2025-09-07T07:51:36.4044216Z * [new branch] gh/slayton58/4/orig -> origin/gh/slayton58/4/orig 2025-09-07T07:51:36.4046967Z * [new branch] gh/slayton58/5/base -> origin/gh/slayton58/5/base 2025-09-07T07:51:36.4048506Z * [new branch] gh/slayton58/5/head -> origin/gh/slayton58/5/head 2025-09-07T07:51:36.4050057Z * [new branch] gh/slayton58/5/orig -> origin/gh/slayton58/5/orig 2025-09-07T07:51:36.4053091Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-09-07T07:51:36.4054631Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-09-07T07:51:36.4056654Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-09-07T07:51:36.4059048Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-09-07T07:51:36.4060800Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-09-07T07:51:36.4062491Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-09-07T07:51:36.4065159Z * [new branch] gh/soulitzer/287/base -> 
origin/gh/soulitzer/287/base 2025-09-07T07:51:36.4066882Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-09-07T07:51:36.4068538Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-09-07T07:51:36.4070830Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 2025-09-07T07:51:36.4072467Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-09-07T07:51:36.4074096Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-09-07T07:51:36.4076847Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-09-07T07:51:36.4078406Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-09-07T07:51:36.4079964Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-09-07T07:51:36.4082197Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-09-07T07:51:36.4083862Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-09-07T07:51:36.4085549Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-09-07T07:51:36.4088019Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-09-07T07:51:36.4089628Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-09-07T07:51:36.4091193Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-09-07T07:51:36.4093372Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-09-07T07:51:36.4095139Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-09-07T07:51:36.4096850Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-09-07T07:51:36.4099082Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-09-07T07:51:36.4100803Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-09-07T07:51:36.4102459Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 
2025-09-07T07:51:36.4105177Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-09-07T07:51:36.4106894Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-09-07T07:51:36.4108287Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-09-07T07:51:36.4110699Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-09-07T07:51:36.4112224Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-09-07T07:51:36.4113800Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-09-07T07:51:36.4116516Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-09-07T07:51:36.4118127Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-09-07T07:51:36.4119690Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-09-07T07:51:36.4122060Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-09-07T07:51:36.4123734Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-09-07T07:51:36.4125451Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-09-07T07:51:36.4127690Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-09-07T07:51:36.4129211Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-09-07T07:51:36.4130770Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-09-07T07:51:36.4133239Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-09-07T07:51:36.4134887Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-09-07T07:51:36.4136798Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-09-07T07:51:36.4139054Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 2025-09-07T07:51:36.4140734Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-09-07T07:51:36.4142700Z * [new 
branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-09-07T07:51:36.4145470Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-09-07T07:51:36.4147147Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-09-07T07:51:36.4148693Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-09-07T07:51:36.4151450Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-09-07T07:51:36.4153048Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-09-07T07:51:36.4154726Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-09-07T07:51:36.4157443Z * [new branch] gh/soulitzer/362/base -> origin/gh/soulitzer/362/base 2025-09-07T07:51:36.4159059Z * [new branch] gh/soulitzer/362/head -> origin/gh/soulitzer/362/head 2025-09-07T07:51:36.4160621Z * [new branch] gh/soulitzer/362/orig -> origin/gh/soulitzer/362/orig 2025-09-07T07:51:36.4162800Z * [new branch] gh/soulitzer/372/base -> origin/gh/soulitzer/372/base 2025-09-07T07:51:36.4164451Z * [new branch] gh/soulitzer/372/head -> origin/gh/soulitzer/372/head 2025-09-07T07:51:36.4166298Z * [new branch] gh/soulitzer/372/orig -> origin/gh/soulitzer/372/orig 2025-09-07T07:51:36.4168623Z * [new branch] gh/soulitzer/373/base -> origin/gh/soulitzer/373/base 2025-09-07T07:51:36.4170392Z * [new branch] gh/soulitzer/373/head -> origin/gh/soulitzer/373/head 2025-09-07T07:51:36.4171790Z * [new branch] gh/soulitzer/373/orig -> origin/gh/soulitzer/373/orig 2025-09-07T07:51:36.4174119Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-09-07T07:51:36.4175992Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-09-07T07:51:36.4177576Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-09-07T07:51:36.4179896Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-09-07T07:51:36.4181410Z * [new branch] gh/soulitzer/375/head -> 
origin/gh/soulitzer/375/head 2025-09-07T07:51:36.4183130Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-09-07T07:51:36.4185514Z * [new branch] gh/soulitzer/376/base -> origin/gh/soulitzer/376/base 2025-09-07T07:51:36.4187226Z * [new branch] gh/soulitzer/376/head -> origin/gh/soulitzer/376/head 2025-09-07T07:51:36.4188746Z * [new branch] gh/soulitzer/376/orig -> origin/gh/soulitzer/376/orig 2025-09-07T07:51:36.4191066Z * [new branch] gh/soulitzer/377/base -> origin/gh/soulitzer/377/base 2025-09-07T07:51:36.4192645Z * [new branch] gh/soulitzer/377/head -> origin/gh/soulitzer/377/head 2025-09-07T07:51:36.4194182Z * [new branch] gh/soulitzer/377/orig -> origin/gh/soulitzer/377/orig 2025-09-07T07:51:36.4196797Z * [new branch] gh/soulitzer/378/base -> origin/gh/soulitzer/378/base 2025-09-07T07:51:36.4198365Z * [new branch] gh/soulitzer/378/head -> origin/gh/soulitzer/378/head 2025-09-07T07:51:36.4199887Z * [new branch] gh/soulitzer/378/orig -> origin/gh/soulitzer/378/orig 2025-09-07T07:51:36.4202121Z * [new branch] gh/soulitzer/379/base -> origin/gh/soulitzer/379/base 2025-09-07T07:51:36.4203708Z * [new branch] gh/soulitzer/379/head -> origin/gh/soulitzer/379/head 2025-09-07T07:51:36.4205318Z * [new branch] gh/soulitzer/379/orig -> origin/gh/soulitzer/379/orig 2025-09-07T07:51:36.4208418Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-09-07T07:51:36.4210979Z * [new branch] gh/swolchok/767/base -> origin/gh/swolchok/767/base 2025-09-07T07:51:36.4212774Z * [new branch] gh/swolchok/767/head -> origin/gh/swolchok/767/head 2025-09-07T07:51:36.4214485Z * [new branch] gh/swolchok/767/orig -> origin/gh/swolchok/767/orig 2025-09-07T07:51:36.4217075Z * [new branch] gh/swolchok/768/base -> origin/gh/swolchok/768/base 2025-09-07T07:51:36.4218661Z * [new branch] gh/swolchok/768/head -> origin/gh/swolchok/768/head 2025-09-07T07:51:36.4220294Z * [new branch] gh/swolchok/768/orig -> origin/gh/swolchok/768/orig 
2025-09-07T07:51:36.4222869Z * [new branch] gh/swolchok/769/base -> origin/gh/swolchok/769/base 2025-09-07T07:51:36.4224439Z * [new branch] gh/swolchok/769/head -> origin/gh/swolchok/769/head 2025-09-07T07:51:36.4226504Z * [new branch] gh/swolchok/769/orig -> origin/gh/swolchok/769/orig 2025-09-07T07:51:36.4228836Z * [new branch] gh/swolchok/771/base -> origin/gh/swolchok/771/base 2025-09-07T07:51:36.4230511Z * [new branch] gh/swolchok/771/head -> origin/gh/swolchok/771/head 2025-09-07T07:51:36.4232101Z * [new branch] gh/swolchok/771/orig -> origin/gh/swolchok/771/orig 2025-09-07T07:51:36.4234348Z * [new branch] gh/swolchok/772/base -> origin/gh/swolchok/772/base 2025-09-07T07:51:36.4236255Z * [new branch] gh/swolchok/772/head -> origin/gh/swolchok/772/head 2025-09-07T07:51:36.4237977Z * [new branch] gh/swolchok/772/orig -> origin/gh/swolchok/772/orig 2025-09-07T07:51:36.4240380Z * [new branch] gh/swolchok/773/base -> origin/gh/swolchok/773/base 2025-09-07T07:51:36.4242019Z * [new branch] gh/swolchok/773/head -> origin/gh/swolchok/773/head 2025-09-07T07:51:36.4243611Z * [new branch] gh/swolchok/773/orig -> origin/gh/swolchok/773/orig 2025-09-07T07:51:36.4246118Z * [new branch] gh/swolchok/786/base -> origin/gh/swolchok/786/base 2025-09-07T07:51:36.4247684Z * [new branch] gh/swolchok/786/head -> origin/gh/swolchok/786/head 2025-09-07T07:51:36.4249502Z * [new branch] gh/swolchok/786/orig -> origin/gh/swolchok/786/orig 2025-09-07T07:51:36.4251612Z * [new branch] gh/swolchok/787/base -> origin/gh/swolchok/787/base 2025-09-07T07:51:36.4253241Z * [new branch] gh/swolchok/787/head -> origin/gh/swolchok/787/head 2025-09-07T07:51:36.4254749Z * [new branch] gh/swolchok/787/orig -> origin/gh/swolchok/787/orig 2025-09-07T07:51:36.4257335Z * [new branch] gh/swolchok/788/base -> origin/gh/swolchok/788/base 2025-09-07T07:51:36.4258868Z * [new branch] gh/swolchok/788/head -> origin/gh/swolchok/788/head 2025-09-07T07:51:36.4260502Z * [new branch] gh/swolchok/788/orig -> 
origin/gh/swolchok/788/orig 2025-09-07T07:51:36.4262849Z * [new branch] gh/swolchok/789/base -> origin/gh/swolchok/789/base 2025-09-07T07:51:36.4264434Z * [new branch] gh/swolchok/789/head -> origin/gh/swolchok/789/head 2025-09-07T07:51:36.4266255Z * [new branch] gh/swolchok/789/orig -> origin/gh/swolchok/789/orig 2025-09-07T07:51:36.4268743Z * [new branch] gh/swolchok/790/base -> origin/gh/swolchok/790/base 2025-09-07T07:51:36.4270597Z * [new branch] gh/swolchok/790/head -> origin/gh/swolchok/790/head 2025-09-07T07:51:36.4272159Z * [new branch] gh/swolchok/790/orig -> origin/gh/swolchok/790/orig 2025-09-07T07:51:36.4274548Z * [new branch] gh/swolchok/791/base -> origin/gh/swolchok/791/base 2025-09-07T07:51:36.4276393Z * [new branch] gh/swolchok/791/head -> origin/gh/swolchok/791/head 2025-09-07T07:51:36.4277995Z * [new branch] gh/swolchok/791/orig -> origin/gh/swolchok/791/orig 2025-09-07T07:51:36.4280256Z * [new branch] gh/swolchok/792/base -> origin/gh/swolchok/792/base 2025-09-07T07:51:36.4281760Z * [new branch] gh/swolchok/792/head -> origin/gh/swolchok/792/head 2025-09-07T07:51:36.4283238Z * [new branch] gh/swolchok/792/orig -> origin/gh/swolchok/792/orig 2025-09-07T07:51:36.4285747Z * [new branch] gh/swolchok/793/base -> origin/gh/swolchok/793/base 2025-09-07T07:51:36.4287395Z * [new branch] gh/swolchok/793/head -> origin/gh/swolchok/793/head 2025-09-07T07:51:36.4289694Z * [new branch] gh/swolchok/793/orig -> origin/gh/swolchok/793/orig 2025-09-07T07:51:36.4291991Z * [new branch] gh/swolchok/794/base -> origin/gh/swolchok/794/base 2025-09-07T07:51:36.4293525Z * [new branch] gh/swolchok/794/head -> origin/gh/swolchok/794/head 2025-09-07T07:51:36.4295122Z * [new branch] gh/swolchok/794/orig -> origin/gh/swolchok/794/orig 2025-09-07T07:51:36.4297695Z * [new branch] gh/swolchok/795/base -> origin/gh/swolchok/795/base 2025-09-07T07:51:36.4299599Z * [new branch] gh/swolchok/795/head -> origin/gh/swolchok/795/head 2025-09-07T07:51:36.4301165Z * [new branch] 
gh/swolchok/795/orig -> origin/gh/swolchok/795/orig 2025-09-07T07:51:36.4303542Z * [new branch] gh/swolchok/796/base -> origin/gh/swolchok/796/base 2025-09-07T07:51:36.4305464Z * [new branch] gh/swolchok/796/head -> origin/gh/swolchok/796/head 2025-09-07T07:51:36.4307105Z * [new branch] gh/swolchok/796/orig -> origin/gh/swolchok/796/orig 2025-09-07T07:51:36.4309486Z * [new branch] gh/swolchok/797/base -> origin/gh/swolchok/797/base 2025-09-07T07:51:36.4311198Z * [new branch] gh/swolchok/797/head -> origin/gh/swolchok/797/head 2025-09-07T07:51:36.4312768Z * [new branch] gh/swolchok/797/orig -> origin/gh/swolchok/797/orig 2025-09-07T07:51:36.4315258Z * [new branch] gh/swolchok/798/base -> origin/gh/swolchok/798/base 2025-09-07T07:51:36.4316963Z * [new branch] gh/swolchok/798/head -> origin/gh/swolchok/798/head 2025-09-07T07:51:36.4318767Z * [new branch] gh/swolchok/798/orig -> origin/gh/swolchok/798/orig 2025-09-07T07:51:36.4321235Z * [new branch] gh/swolchok/799/base -> origin/gh/swolchok/799/base 2025-09-07T07:51:36.4322828Z * [new branch] gh/swolchok/799/head -> origin/gh/swolchok/799/head 2025-09-07T07:51:36.4324569Z * [new branch] gh/swolchok/799/orig -> origin/gh/swolchok/799/orig 2025-09-07T07:51:36.4327448Z * [new branch] gh/swolchok/800/base -> origin/gh/swolchok/800/base 2025-09-07T07:51:36.4329018Z * [new branch] gh/swolchok/800/head -> origin/gh/swolchok/800/head 2025-09-07T07:51:36.4330677Z * [new branch] gh/swolchok/800/orig -> origin/gh/swolchok/800/orig 2025-09-07T07:51:36.4332970Z * [new branch] gh/swolchok/801/base -> origin/gh/swolchok/801/base 2025-09-07T07:51:36.4334553Z * [new branch] gh/swolchok/801/head -> origin/gh/swolchok/801/head 2025-09-07T07:51:36.4336731Z * [new branch] gh/swolchok/801/orig -> origin/gh/swolchok/801/orig 2025-09-07T07:51:36.4339035Z * [new branch] gh/swolchok/802/base -> origin/gh/swolchok/802/base 2025-09-07T07:51:36.4340587Z * [new branch] gh/swolchok/802/head -> origin/gh/swolchok/802/head 
2025-09-07T07:51:36.4342311Z * [new branch] gh/swolchok/802/orig -> origin/gh/swolchok/802/orig 2025-09-07T07:51:36.4344546Z * [new branch] gh/swolchok/803/base -> origin/gh/swolchok/803/base 2025-09-07T07:51:36.4346373Z * [new branch] gh/swolchok/803/head -> origin/gh/swolchok/803/head 2025-09-07T07:51:36.4348264Z * [new branch] gh/swolchok/803/orig -> origin/gh/swolchok/803/orig 2025-09-07T07:51:36.4350540Z * [new branch] gh/swolchok/804/base -> origin/gh/swolchok/804/base 2025-09-07T07:51:36.4352079Z * [new branch] gh/swolchok/804/head -> origin/gh/swolchok/804/head 2025-09-07T07:51:36.4353750Z * [new branch] gh/swolchok/804/orig -> origin/gh/swolchok/804/orig 2025-09-07T07:51:36.4356445Z * [new branch] gh/swolchok/805/base -> origin/gh/swolchok/805/base 2025-09-07T07:51:36.4357979Z * [new branch] gh/swolchok/805/head -> origin/gh/swolchok/805/head 2025-09-07T07:51:36.4359552Z * [new branch] gh/swolchok/805/orig -> origin/gh/swolchok/805/orig 2025-09-07T07:51:36.4361655Z * [new branch] gh/swolchok/806/base -> origin/gh/swolchok/806/base 2025-09-07T07:51:36.4363300Z * [new branch] gh/swolchok/806/head -> origin/gh/swolchok/806/head 2025-09-07T07:51:36.4364816Z * [new branch] gh/swolchok/806/orig -> origin/gh/swolchok/806/orig 2025-09-07T07:51:36.4367920Z * [new branch] gh/swolchok/807/base -> origin/gh/swolchok/807/base 2025-09-07T07:51:36.4369500Z * [new branch] gh/swolchok/807/head -> origin/gh/swolchok/807/head 2025-09-07T07:51:36.4371129Z * [new branch] gh/swolchok/807/orig -> origin/gh/swolchok/807/orig 2025-09-07T07:51:36.4373554Z * [new branch] gh/swolchok/808/base -> origin/gh/swolchok/808/base 2025-09-07T07:51:36.4375205Z * [new branch] gh/swolchok/808/head -> origin/gh/swolchok/808/head 2025-09-07T07:51:36.4376872Z * [new branch] gh/swolchok/808/orig -> origin/gh/swolchok/808/orig 2025-09-07T07:51:36.4379209Z * [new branch] gh/swolchok/809/base -> origin/gh/swolchok/809/base 2025-09-07T07:51:36.4380768Z * [new branch] gh/swolchok/809/head -> 
origin/gh/swolchok/809/head 2025-09-07T07:51:36.4382474Z * [new branch] gh/swolchok/809/orig -> origin/gh/swolchok/809/orig 2025-09-07T07:51:36.4384887Z * [new branch] gh/swolchok/810/base -> origin/gh/swolchok/810/base 2025-09-07T07:51:36.4387075Z * [new branch] gh/swolchok/810/head -> origin/gh/swolchok/810/head 2025-09-07T07:51:36.4388629Z * [new branch] gh/swolchok/810/orig -> origin/gh/swolchok/810/orig 2025-09-07T07:51:36.4391304Z * [new branch] gh/swolchok/811/base -> origin/gh/swolchok/811/base 2025-09-07T07:51:36.4392829Z * [new branch] gh/swolchok/811/head -> origin/gh/swolchok/811/head 2025-09-07T07:51:36.4394479Z * [new branch] gh/swolchok/811/orig -> origin/gh/swolchok/811/orig 2025-09-07T07:51:36.4397143Z * [new branch] gh/swolchok/812/base -> origin/gh/swolchok/812/base 2025-09-07T07:51:36.4398883Z * [new branch] gh/swolchok/812/head -> origin/gh/swolchok/812/head 2025-09-07T07:51:36.4400379Z * [new branch] gh/swolchok/812/orig -> origin/gh/swolchok/812/orig 2025-09-07T07:51:36.4402778Z * [new branch] gh/swolchok/813/base -> origin/gh/swolchok/813/base 2025-09-07T07:51:36.4404329Z * [new branch] gh/swolchok/813/head -> origin/gh/swolchok/813/head 2025-09-07T07:51:36.4406267Z * [new branch] gh/swolchok/813/orig -> origin/gh/swolchok/813/orig 2025-09-07T07:51:36.4408603Z * [new branch] gh/swolchok/814/base -> origin/gh/swolchok/814/base 2025-09-07T07:51:36.4410118Z * [new branch] gh/swolchok/814/head -> origin/gh/swolchok/814/head 2025-09-07T07:51:36.4411668Z * [new branch] gh/swolchok/814/orig -> origin/gh/swolchok/814/orig 2025-09-07T07:51:36.4413942Z * [new branch] gh/swolchok/815/base -> origin/gh/swolchok/815/base 2025-09-07T07:51:36.4415616Z * [new branch] gh/swolchok/815/head -> origin/gh/swolchok/815/head 2025-09-07T07:51:36.4417251Z * [new branch] gh/swolchok/815/orig -> origin/gh/swolchok/815/orig 2025-09-07T07:51:36.4419523Z * [new branch] gh/swolchok/816/base -> origin/gh/swolchok/816/base 2025-09-07T07:51:36.4421116Z * [new branch] 
gh/swolchok/816/head -> origin/gh/swolchok/816/head 2025-09-07T07:51:36.4422797Z * [new branch] gh/swolchok/816/orig -> origin/gh/swolchok/816/orig 2025-09-07T07:51:36.4425357Z * [new branch] gh/swolchok/817/base -> origin/gh/swolchok/817/base 2025-09-07T07:51:36.4426949Z * [new branch] gh/swolchok/817/head -> origin/gh/swolchok/817/head 2025-09-07T07:51:36.4428454Z * [new branch] gh/swolchok/817/orig -> origin/gh/swolchok/817/orig 2025-09-07T07:51:36.4430811Z * [new branch] gh/swolchok/818/base -> origin/gh/swolchok/818/base 2025-09-07T07:51:36.4432498Z * [new branch] gh/swolchok/818/head -> origin/gh/swolchok/818/head 2025-09-07T07:51:36.4434062Z * [new branch] gh/swolchok/818/orig -> origin/gh/swolchok/818/orig 2025-09-07T07:51:36.4436709Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-09-07T07:51:36.4438394Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-09-07T07:51:36.4440151Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-09-07T07:51:36.4442272Z * [new branch] gh/swolchok/820/base -> origin/gh/swolchok/820/base 2025-09-07T07:51:36.4443950Z * [new branch] gh/swolchok/820/head -> origin/gh/swolchok/820/head 2025-09-07T07:51:36.4445774Z * [new branch] gh/swolchok/820/orig -> origin/gh/swolchok/820/orig 2025-09-07T07:51:36.4448090Z * [new branch] gh/swolchok/821/base -> origin/gh/swolchok/821/base 2025-09-07T07:51:36.4449583Z * [new branch] gh/swolchok/821/head -> origin/gh/swolchok/821/head 2025-09-07T07:51:36.4451129Z * [new branch] gh/swolchok/821/orig -> origin/gh/swolchok/821/orig 2025-09-07T07:51:36.4453482Z * [new branch] gh/swolchok/822/base -> origin/gh/swolchok/822/base 2025-09-07T07:51:36.4455115Z * [new branch] gh/swolchok/822/head -> origin/gh/swolchok/822/head 2025-09-07T07:51:36.4456853Z * [new branch] gh/swolchok/822/orig -> origin/gh/swolchok/822/orig 2025-09-07T07:51:36.4459056Z * [new branch] gh/swolchok/823/base -> origin/gh/swolchok/823/base 
2025-09-07T07:51:36.4460578Z * [new branch] gh/swolchok/823/head -> origin/gh/swolchok/823/head 2025-09-07T07:51:36.4462156Z * [new branch] gh/swolchok/823/orig -> origin/gh/swolchok/823/orig 2025-09-07T07:51:36.4464509Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-09-07T07:51:36.4466334Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-09-07T07:51:36.4468006Z * [new branch] gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-09-07T07:51:36.4470448Z * [new branch] gh/swolchok/825/base -> origin/gh/swolchok/825/base 2025-09-07T07:51:36.4472069Z * [new branch] gh/swolchok/825/head -> origin/gh/swolchok/825/head 2025-09-07T07:51:36.4473642Z * [new branch] gh/swolchok/825/orig -> origin/gh/swolchok/825/orig 2025-09-07T07:51:36.4476175Z * [new branch] gh/swolchok/826/base -> origin/gh/swolchok/826/base 2025-09-07T07:51:36.4477648Z * [new branch] gh/swolchok/826/head -> origin/gh/swolchok/826/head 2025-09-07T07:51:36.4479068Z * [new branch] gh/swolchok/826/orig -> origin/gh/swolchok/826/orig 2025-09-07T07:51:36.4481476Z * [new branch] gh/swolchok/827/base -> origin/gh/swolchok/827/base 2025-09-07T07:51:36.4483008Z * [new branch] gh/swolchok/827/head -> origin/gh/swolchok/827/head 2025-09-07T07:51:36.4484462Z * [new branch] gh/swolchok/827/orig -> origin/gh/swolchok/827/orig 2025-09-07T07:51:36.4487206Z * [new branch] gh/swolchok/828/base -> origin/gh/swolchok/828/base 2025-09-07T07:51:36.4488957Z * [new branch] gh/swolchok/828/head -> origin/gh/swolchok/828/head 2025-09-07T07:51:36.4490448Z * [new branch] gh/swolchok/828/orig -> origin/gh/swolchok/828/orig 2025-09-07T07:51:36.4492539Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-09-07T07:51:36.4494120Z * [new branch] gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-09-07T07:51:36.4495979Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-09-07T07:51:36.4498289Z * [new branch] gh/swolchok/830/base -> 
origin/gh/swolchok/830/base 2025-09-07T07:51:36.4499782Z * [new branch] gh/swolchok/830/head -> origin/gh/swolchok/830/head 2025-09-07T07:51:36.4501301Z * [new branch] gh/swolchok/830/orig -> origin/gh/swolchok/830/orig 2025-09-07T07:51:36.4503536Z * [new branch] gh/swolchok/831/base -> origin/gh/swolchok/831/base 2025-09-07T07:51:36.4505433Z * [new branch] gh/swolchok/831/head -> origin/gh/swolchok/831/head 2025-09-07T07:51:36.4507031Z * [new branch] gh/swolchok/831/orig -> origin/gh/swolchok/831/orig 2025-09-07T07:51:36.4509299Z * [new branch] gh/swolchok/832/base -> origin/gh/swolchok/832/base 2025-09-07T07:51:36.4511122Z * [new branch] gh/swolchok/832/head -> origin/gh/swolchok/832/head 2025-09-07T07:51:36.4512612Z * [new branch] gh/swolchok/832/orig -> origin/gh/swolchok/832/orig 2025-09-07T07:51:36.4515472Z * [new branch] gh/syed-ahmed/3/base -> origin/gh/syed-ahmed/3/base 2025-09-07T07:51:36.4517091Z * [new branch] gh/syed-ahmed/3/head -> origin/gh/syed-ahmed/3/head 2025-09-07T07:51:36.4518679Z * [new branch] gh/syed-ahmed/3/orig -> origin/gh/syed-ahmed/3/orig 2025-09-07T07:51:36.4520828Z * [new branch] gh/syed-ahmed/4/base -> origin/gh/syed-ahmed/4/base 2025-09-07T07:51:36.4522353Z * [new branch] gh/syed-ahmed/4/head -> origin/gh/syed-ahmed/4/head 2025-09-07T07:51:36.4523860Z * [new branch] gh/syed-ahmed/4/orig -> origin/gh/syed-ahmed/4/orig 2025-09-07T07:51:36.4526485Z * [new branch] gh/syed-ahmed/5/base -> origin/gh/syed-ahmed/5/base 2025-09-07T07:51:36.4527976Z * [new branch] gh/syed-ahmed/5/head -> origin/gh/syed-ahmed/5/head 2025-09-07T07:51:36.4529542Z * [new branch] gh/syed-ahmed/5/orig -> origin/gh/syed-ahmed/5/orig 2025-09-07T07:51:36.4532425Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-09-07T07:51:36.4534038Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-09-07T07:51:36.4535819Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-09-07T07:51:36.4538796Z * [new branch] gh/tianyu-l/2/base 
-> origin/gh/tianyu-l/2/base 2025-09-07T07:51:36.4540422Z * [new branch] gh/tianyu-l/2/head -> origin/gh/tianyu-l/2/head 2025-09-07T07:51:36.4542084Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-09-07T07:51:36.4544231Z * [new branch] gh/tianyu-l/3/base -> origin/gh/tianyu-l/3/base 2025-09-07T07:51:36.4546006Z * [new branch] gh/tianyu-l/3/head -> origin/gh/tianyu-l/3/head 2025-09-07T07:51:36.4547549Z * [new branch] gh/tianyu-l/3/orig -> origin/gh/tianyu-l/3/orig 2025-09-07T07:51:36.4550040Z * [new branch] gh/tianyu-l/4/base -> origin/gh/tianyu-l/4/base 2025-09-07T07:51:36.4551639Z * [new branch] gh/tianyu-l/4/head -> origin/gh/tianyu-l/4/head 2025-09-07T07:51:36.4553133Z * [new branch] gh/tianyu-l/4/orig -> origin/gh/tianyu-l/4/orig 2025-09-07T07:51:36.4556413Z * [new branch] gh/tugsbayasgalan/1/base -> origin/gh/tugsbayasgalan/1/base 2025-09-07T07:51:36.4557919Z * [new branch] gh/tugsbayasgalan/1/head -> origin/gh/tugsbayasgalan/1/head 2025-09-07T07:51:36.4559493Z * [new branch] gh/tugsbayasgalan/1/orig -> origin/gh/tugsbayasgalan/1/orig 2025-09-07T07:51:36.4561961Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-09-07T07:51:36.4563583Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-09-07T07:51:36.4565243Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-09-07T07:51:36.4567404Z * [new branch] gh/tugsbayasgalan/11/base -> origin/gh/tugsbayasgalan/11/base 2025-09-07T07:51:36.4569058Z * [new branch] gh/tugsbayasgalan/11/head -> origin/gh/tugsbayasgalan/11/head 2025-09-07T07:51:36.4570530Z * [new branch] gh/tugsbayasgalan/11/orig -> origin/gh/tugsbayasgalan/11/orig 2025-09-07T07:51:36.4573071Z * [new branch] gh/tugsbayasgalan/12/base -> origin/gh/tugsbayasgalan/12/base 2025-09-07T07:51:36.4574474Z * [new branch] gh/tugsbayasgalan/12/head -> origin/gh/tugsbayasgalan/12/head 2025-09-07T07:51:36.4576246Z * [new branch] gh/tugsbayasgalan/12/orig -> 
origin/gh/tugsbayasgalan/12/orig 2025-09-07T07:51:36.4578433Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-09-07T07:51:36.4580017Z * [new branch] gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-09-07T07:51:36.4581569Z * [new branch] gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-09-07T07:51:36.4583890Z * [new branch] gh/tugsbayasgalan/14/base -> origin/gh/tugsbayasgalan/14/base 2025-09-07T07:51:36.4585526Z * [new branch] gh/tugsbayasgalan/14/head -> origin/gh/tugsbayasgalan/14/head 2025-09-07T07:51:36.4587188Z * [new branch] gh/tugsbayasgalan/14/orig -> origin/gh/tugsbayasgalan/14/orig 2025-09-07T07:51:36.4589832Z * [new branch] gh/tugsbayasgalan/15/base -> origin/gh/tugsbayasgalan/15/base 2025-09-07T07:51:36.4591492Z * [new branch] gh/tugsbayasgalan/15/head -> origin/gh/tugsbayasgalan/15/head 2025-09-07T07:51:36.4593023Z * [new branch] gh/tugsbayasgalan/15/orig -> origin/gh/tugsbayasgalan/15/orig 2025-09-07T07:51:36.4595508Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-09-07T07:51:36.4597278Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-09-07T07:51:36.4598971Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-09-07T07:51:36.4601065Z * [new branch] gh/tugsbayasgalan/3/base -> origin/gh/tugsbayasgalan/3/base 2025-09-07T07:51:36.4602718Z * [new branch] gh/tugsbayasgalan/3/head -> origin/gh/tugsbayasgalan/3/head 2025-09-07T07:51:36.4604249Z * [new branch] gh/tugsbayasgalan/3/orig -> origin/gh/tugsbayasgalan/3/orig 2025-09-07T07:51:36.4606820Z * [new branch] gh/tugsbayasgalan/4/base -> origin/gh/tugsbayasgalan/4/base 2025-09-07T07:51:36.4608488Z * [new branch] gh/tugsbayasgalan/4/head -> origin/gh/tugsbayasgalan/4/head 2025-09-07T07:51:36.4610058Z * [new branch] gh/tugsbayasgalan/4/orig -> origin/gh/tugsbayasgalan/4/orig 2025-09-07T07:51:36.4612325Z * [new branch] gh/tugsbayasgalan/5/base -> 
origin/gh/tugsbayasgalan/5/base 2025-09-07T07:51:36.4613881Z * [new branch] gh/tugsbayasgalan/5/head -> origin/gh/tugsbayasgalan/5/head 2025-09-07T07:51:36.4615777Z * [new branch] gh/tugsbayasgalan/5/orig -> origin/gh/tugsbayasgalan/5/orig 2025-09-07T07:51:36.4617952Z * [new branch] gh/tugsbayasgalan/6/base -> origin/gh/tugsbayasgalan/6/base 2025-09-07T07:51:36.4619667Z * [new branch] gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-09-07T07:51:36.4621675Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-09-07T07:51:36.4623989Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-09-07T07:51:36.4625718Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-09-07T07:51:36.4627377Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-09-07T07:51:36.4629717Z * [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-09-07T07:51:36.4631238Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-09-07T07:51:36.4632756Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-09-07T07:51:36.4634913Z * [new branch] gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-09-07T07:51:36.4636865Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-09-07T07:51:36.4638343Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-09-07T07:51:36.4641067Z * [new branch] gh/v0i0/1/base -> origin/gh/v0i0/1/base 2025-09-07T07:51:36.4642614Z * [new branch] gh/v0i0/1/head -> origin/gh/v0i0/1/head 2025-09-07T07:51:36.4644120Z * [new branch] gh/v0i0/1/orig -> origin/gh/v0i0/1/orig 2025-09-07T07:51:36.4646715Z * [new branch] gh/v0i0/4/base -> origin/gh/v0i0/4/base 2025-09-07T07:51:36.4648215Z * [new branch] gh/v0i0/4/head -> origin/gh/v0i0/4/head 2025-09-07T07:51:36.4649722Z * [new branch] gh/v0i0/4/orig -> origin/gh/v0i0/4/orig 
2025-09-07T07:51:36.4651922Z * [new branch] gh/v0i0/6/base -> origin/gh/v0i0/6/base 2025-09-07T07:51:36.4653498Z * [new branch] gh/v0i0/6/head -> origin/gh/v0i0/6/head 2025-09-07T07:51:36.4655093Z * [new branch] gh/v0i0/6/orig -> origin/gh/v0i0/6/orig 2025-09-07T07:51:36.4657545Z * [new branch] gh/v0i0/7/base -> origin/gh/v0i0/7/base 2025-09-07T07:51:36.4659119Z * [new branch] gh/v0i0/7/head -> origin/gh/v0i0/7/head 2025-09-07T07:51:36.4660671Z * [new branch] gh/v0i0/7/orig -> origin/gh/v0i0/7/orig 2025-09-07T07:51:36.4662877Z * [new branch] gh/v0i0/8/base -> origin/gh/v0i0/8/base 2025-09-07T07:51:36.4664291Z * [new branch] gh/v0i0/8/head -> origin/gh/v0i0/8/head 2025-09-07T07:51:36.4666161Z * [new branch] gh/v0i0/8/orig -> origin/gh/v0i0/8/orig 2025-09-07T07:51:36.4668472Z * [new branch] gh/v0i0/9/base -> origin/gh/v0i0/9/base 2025-09-07T07:51:36.4669995Z * [new branch] gh/v0i0/9/head -> origin/gh/v0i0/9/head 2025-09-07T07:51:36.4671527Z * [new branch] gh/v0i0/9/orig -> origin/gh/v0i0/9/orig 2025-09-07T07:51:36.4674327Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-09-07T07:51:36.4676755Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-09-07T07:51:36.4679170Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-09-07T07:51:36.4681398Z * [new branch] gh/vkuzo/4/base -> origin/gh/vkuzo/4/base 2025-09-07T07:51:36.4683056Z * [new branch] gh/vkuzo/4/head -> origin/gh/vkuzo/4/head 2025-09-07T07:51:36.4684705Z * [new branch] gh/vkuzo/4/orig -> origin/gh/vkuzo/4/orig 2025-09-07T07:51:36.4687316Z * [new branch] gh/vkuzo/5/base -> origin/gh/vkuzo/5/base 2025-09-07T07:51:36.4688992Z * [new branch] gh/vkuzo/5/head -> origin/gh/vkuzo/5/head 2025-09-07T07:51:36.4690568Z * [new branch] gh/vkuzo/5/orig -> origin/gh/vkuzo/5/orig 2025-09-07T07:51:36.4692979Z * [new branch] gh/vkuzo/6/base -> origin/gh/vkuzo/6/base 2025-09-07T07:51:36.4694370Z * [new branch] gh/vkuzo/6/head -> origin/gh/vkuzo/6/head 2025-09-07T07:51:36.4696253Z * [new branch] 
gh/vkuzo/6/orig -> origin/gh/vkuzo/6/orig 2025-09-07T07:51:36.4698294Z * [new branch] gh/vkuzo/7/base -> origin/gh/vkuzo/7/base 2025-09-07T07:51:36.4700209Z * [new branch] gh/vkuzo/7/head -> origin/gh/vkuzo/7/head 2025-09-07T07:51:36.4701964Z * [new branch] gh/vkuzo/7/orig -> origin/gh/vkuzo/7/orig 2025-09-07T07:51:36.4704826Z * [new branch] gh/wconstab/419/base -> origin/gh/wconstab/419/base 2025-09-07T07:51:36.4706767Z * [new branch] gh/wconstab/419/head -> origin/gh/wconstab/419/head 2025-09-07T07:51:36.4708232Z * [new branch] gh/wconstab/419/orig -> origin/gh/wconstab/419/orig 2025-09-07T07:51:36.4710522Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-09-07T07:51:36.4712017Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-09-07T07:51:36.4713580Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-09-07T07:51:36.4716121Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-09-07T07:51:36.4717941Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-09-07T07:51:36.4719678Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-09-07T07:51:36.4721864Z * [new branch] gh/wconstab/438/base -> origin/gh/wconstab/438/base 2025-09-07T07:51:36.4723552Z * [new branch] gh/wconstab/438/head -> origin/gh/wconstab/438/head 2025-09-07T07:51:36.4724910Z * [new branch] gh/wconstab/438/orig -> origin/gh/wconstab/438/orig 2025-09-07T07:51:36.4727430Z * [new branch] gh/wconstab/440/base -> origin/gh/wconstab/440/base 2025-09-07T07:51:36.4729453Z * [new branch] gh/wconstab/440/head -> origin/gh/wconstab/440/head 2025-09-07T07:51:36.4731070Z * [new branch] gh/wconstab/440/orig -> origin/gh/wconstab/440/orig 2025-09-07T07:51:36.4733354Z * [new branch] gh/wconstab/441/base -> origin/gh/wconstab/441/base 2025-09-07T07:51:36.4734913Z * [new branch] gh/wconstab/441/head -> origin/gh/wconstab/441/head 2025-09-07T07:51:36.4736793Z * [new branch] gh/wconstab/441/orig -> 
origin/gh/wconstab/441/orig 2025-09-07T07:51:36.4739459Z * [new branch] gh/wconstab/442/base -> origin/gh/wconstab/442/base 2025-09-07T07:51:36.4741104Z * [new branch] gh/wconstab/442/head -> origin/gh/wconstab/442/head 2025-09-07T07:51:36.4742854Z * [new branch] gh/wconstab/442/orig -> origin/gh/wconstab/442/orig 2025-09-07T07:51:36.4745081Z * [new branch] gh/wconstab/443/base -> origin/gh/wconstab/443/base 2025-09-07T07:51:36.4746796Z * [new branch] gh/wconstab/443/head -> origin/gh/wconstab/443/head 2025-09-07T07:51:36.4748301Z * [new branch] gh/wconstab/443/orig -> origin/gh/wconstab/443/orig 2025-09-07T07:51:36.4750540Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-09-07T07:51:36.4752161Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-09-07T07:51:36.4753706Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-09-07T07:51:36.4756189Z * [new branch] gh/wconstab/445/base -> origin/gh/wconstab/445/base 2025-09-07T07:51:36.4757690Z * [new branch] gh/wconstab/445/head -> origin/gh/wconstab/445/head 2025-09-07T07:51:36.4759251Z * [new branch] gh/wconstab/445/orig -> origin/gh/wconstab/445/orig 2025-09-07T07:51:36.4761957Z * [new branch] gh/wconstab/446/base -> origin/gh/wconstab/446/base 2025-09-07T07:51:36.4763700Z * [new branch] gh/wconstab/446/head -> origin/gh/wconstab/446/head 2025-09-07T07:51:36.4766045Z * [new branch] gh/wconstab/446/orig -> origin/gh/wconstab/446/orig 2025-09-07T07:51:36.4768195Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-09-07T07:51:36.4769687Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-09-07T07:51:36.4771286Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-09-07T07:51:36.4774351Z * [new branch] gh/weifengpy/27/base -> origin/gh/weifengpy/27/base 2025-09-07T07:51:36.4776060Z * [new branch] gh/weifengpy/27/head -> origin/gh/weifengpy/27/head 2025-09-07T07:51:36.4777578Z * [new branch] 
gh/weifengpy/27/orig -> origin/gh/weifengpy/27/orig 2025-09-07T07:51:36.4779767Z * [new branch] gh/weifengpy/30/base -> origin/gh/weifengpy/30/base 2025-09-07T07:51:36.4781288Z * [new branch] gh/weifengpy/30/head -> origin/gh/weifengpy/30/head 2025-09-07T07:51:36.4782927Z * [new branch] gh/weifengpy/30/orig -> origin/gh/weifengpy/30/orig 2025-09-07T07:51:36.4785956Z * [new branch] gh/williamwen42/196/base -> origin/gh/williamwen42/196/base 2025-09-07T07:51:36.4787569Z * [new branch] gh/williamwen42/196/head -> origin/gh/williamwen42/196/head 2025-09-07T07:51:36.4789231Z * [new branch] gh/williamwen42/196/orig -> origin/gh/williamwen42/196/orig 2025-09-07T07:51:36.4791466Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-09-07T07:51:36.4793067Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-09-07T07:51:36.4794616Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-09-07T07:51:36.4797298Z * [new branch] gh/williamwen42/258/base -> origin/gh/williamwen42/258/base 2025-09-07T07:51:36.4799269Z * [new branch] gh/williamwen42/258/head -> origin/gh/williamwen42/258/head 2025-09-07T07:51:36.4802672Z * [new branch] gh/williamwen42/258/orig -> origin/gh/williamwen42/258/orig 2025-09-07T07:51:36.4803553Z * [new branch] gh/williamwen42/266/base -> origin/gh/williamwen42/266/base 2025-09-07T07:51:36.4804313Z * [new branch] gh/williamwen42/266/head -> origin/gh/williamwen42/266/head 2025-09-07T07:51:36.4806238Z * [new branch] gh/williamwen42/266/orig -> origin/gh/williamwen42/266/orig 2025-09-07T07:51:36.4808423Z * [new branch] gh/williamwen42/267/base -> origin/gh/williamwen42/267/base 2025-09-07T07:51:36.4810010Z * [new branch] gh/williamwen42/267/head -> origin/gh/williamwen42/267/head 2025-09-07T07:51:36.4811698Z * [new branch] gh/williamwen42/267/orig -> origin/gh/williamwen42/267/orig 2025-09-07T07:51:36.4814152Z * [new branch] gh/williamwen42/270/base -> 
origin/gh/williamwen42/270/base 2025-09-07T07:51:36.4816037Z * [new branch] gh/williamwen42/270/head -> origin/gh/williamwen42/270/head 2025-09-07T07:51:36.4817632Z * [new branch] gh/williamwen42/270/orig -> origin/gh/williamwen42/270/orig 2025-09-07T07:51:36.4819858Z * [new branch] gh/williamwen42/271/base -> origin/gh/williamwen42/271/base 2025-09-07T07:51:36.4821407Z * [new branch] gh/williamwen42/271/head -> origin/gh/williamwen42/271/head 2025-09-07T07:51:36.4823255Z * [new branch] gh/williamwen42/271/orig -> origin/gh/williamwen42/271/orig 2025-09-07T07:51:36.4825471Z * [new branch] gh/williamwen42/272/base -> origin/gh/williamwen42/272/base 2025-09-07T07:51:36.4827095Z * [new branch] gh/williamwen42/272/head -> origin/gh/williamwen42/272/head 2025-09-07T07:51:36.4828855Z * [new branch] gh/williamwen42/272/orig -> origin/gh/williamwen42/272/orig 2025-09-07T07:51:36.4831247Z * [new branch] gh/williamwen42/274/base -> origin/gh/williamwen42/274/base 2025-09-07T07:51:36.4832808Z * [new branch] gh/williamwen42/274/head -> origin/gh/williamwen42/274/head 2025-09-07T07:51:36.4834408Z * [new branch] gh/williamwen42/274/orig -> origin/gh/williamwen42/274/orig 2025-09-07T07:51:36.4836955Z * [new branch] gh/williamwen42/275/base -> origin/gh/williamwen42/275/base 2025-09-07T07:51:36.4838670Z * [new branch] gh/williamwen42/275/head -> origin/gh/williamwen42/275/head 2025-09-07T07:51:36.4840671Z * [new branch] gh/williamwen42/276/base -> origin/gh/williamwen42/276/base 2025-09-07T07:51:36.4842214Z * [new branch] gh/williamwen42/276/head -> origin/gh/williamwen42/276/head 2025-09-07T07:51:36.4843768Z * [new branch] gh/williamwen42/276/orig -> origin/gh/williamwen42/276/orig 2025-09-07T07:51:36.4846411Z * [new branch] gh/williamwen42/277/base -> origin/gh/williamwen42/277/base 2025-09-07T07:51:36.4847969Z * [new branch] gh/williamwen42/277/head -> origin/gh/williamwen42/277/head 2025-09-07T07:51:36.4849452Z * [new branch] gh/williamwen42/277/orig -> 
origin/gh/williamwen42/277/orig 2025-09-07T07:51:36.4851724Z * [new branch] gh/williamwen42/278/base -> origin/gh/williamwen42/278/base 2025-09-07T07:51:36.4853314Z * [new branch] gh/williamwen42/278/head -> origin/gh/williamwen42/278/head 2025-09-07T07:51:36.4854830Z * [new branch] gh/williamwen42/278/orig -> origin/gh/williamwen42/278/orig 2025-09-07T07:51:36.4857495Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-09-07T07:51:36.4859343Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-09-07T07:51:36.4860822Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-09-07T07:51:36.4863297Z * [new branch] gh/williamwen42/280/base -> origin/gh/williamwen42/280/base 2025-09-07T07:51:36.4864764Z * [new branch] gh/williamwen42/280/head -> origin/gh/williamwen42/280/head 2025-09-07T07:51:36.4866550Z * [new branch] gh/williamwen42/280/orig -> origin/gh/williamwen42/280/orig 2025-09-07T07:51:36.4868965Z * [new branch] gh/williamwen42/281/base -> origin/gh/williamwen42/281/base 2025-09-07T07:51:36.4870748Z * [new branch] gh/williamwen42/281/head -> origin/gh/williamwen42/281/head 2025-09-07T07:51:36.4872317Z * [new branch] gh/williamwen42/281/orig -> origin/gh/williamwen42/281/orig 2025-09-07T07:51:36.4874378Z * [new branch] gh/williamwen42/282/base -> origin/gh/williamwen42/282/base 2025-09-07T07:51:36.4876236Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-09-07T07:51:36.4877800Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-09-07T07:51:36.4880279Z * [new branch] gh/williamwen42/283/base -> origin/gh/williamwen42/283/base 2025-09-07T07:51:36.4881903Z * [new branch] gh/williamwen42/283/head -> origin/gh/williamwen42/283/head 2025-09-07T07:51:36.4883439Z * [new branch] gh/williamwen42/283/orig -> origin/gh/williamwen42/283/orig 2025-09-07T07:51:36.4886181Z * [new branch] gh/williamwen42/284/base -> 
origin/gh/williamwen42/284/base 2025-09-07T07:51:36.4887766Z * [new branch] gh/williamwen42/284/head -> origin/gh/williamwen42/284/head 2025-09-07T07:51:36.4889279Z * [new branch] gh/williamwen42/284/orig -> origin/gh/williamwen42/284/orig 2025-09-07T07:51:36.4891442Z * [new branch] gh/williamwen42/285/base -> origin/gh/williamwen42/285/base 2025-09-07T07:51:36.4892982Z * [new branch] gh/williamwen42/285/head -> origin/gh/williamwen42/285/head 2025-09-07T07:51:36.4894565Z * [new branch] gh/williamwen42/285/orig -> origin/gh/williamwen42/285/orig 2025-09-07T07:51:36.4896871Z * [new branch] gh/williamwen42/286/base -> origin/gh/williamwen42/286/base 2025-09-07T07:51:36.4898362Z * [new branch] gh/williamwen42/286/head -> origin/gh/williamwen42/286/head 2025-09-07T07:51:36.4899807Z * [new branch] gh/williamwen42/286/orig -> origin/gh/williamwen42/286/orig 2025-09-07T07:51:36.4902395Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-09-07T07:51:36.4903914Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-09-07T07:51:36.4905734Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 2025-09-07T07:51:36.4908274Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-09-07T07:51:36.4909925Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-09-07T07:51:36.4911513Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-09-07T07:51:36.4913714Z * [new branch] gh/williamwen42/289/base -> origin/gh/williamwen42/289/base 2025-09-07T07:51:36.4915339Z * [new branch] gh/williamwen42/289/head -> origin/gh/williamwen42/289/head 2025-09-07T07:51:36.4917118Z * [new branch] gh/williamwen42/289/orig -> origin/gh/williamwen42/289/orig 2025-09-07T07:51:36.4920098Z * [new branch] gh/wychi/1/base -> origin/gh/wychi/1/base 2025-09-07T07:51:36.4921655Z * [new branch] gh/wychi/1/head -> origin/gh/wychi/1/head 
2025-09-07T07:51:36.4923288Z * [new branch] gh/wychi/1/orig -> origin/gh/wychi/1/orig 2025-09-07T07:51:36.4926345Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-09-07T07:51:36.4927985Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-09-07T07:51:36.4930039Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-09-07T07:51:36.4931500Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-09-07T07:51:36.4933780Z * [new branch] gh/xmfan/18/base -> origin/gh/xmfan/18/base 2025-09-07T07:51:36.4935524Z * [new branch] gh/xmfan/18/head -> origin/gh/xmfan/18/head 2025-09-07T07:51:36.4937775Z * [new branch] gh/xmfan/229/base -> origin/gh/xmfan/229/base 2025-09-07T07:51:36.4939275Z * [new branch] gh/xmfan/229/head -> origin/gh/xmfan/229/head 2025-09-07T07:51:36.4940776Z * [new branch] gh/xmfan/229/orig -> origin/gh/xmfan/229/orig 2025-09-07T07:51:36.4943070Z * [new branch] gh/xmfan/237/base -> origin/gh/xmfan/237/base 2025-09-07T07:51:36.4944682Z * [new branch] gh/xmfan/237/head -> origin/gh/xmfan/237/head 2025-09-07T07:51:36.4946548Z * [new branch] gh/xmfan/237/orig -> origin/gh/xmfan/237/orig 2025-09-07T07:51:36.4948799Z * [new branch] gh/xmfan/244/base -> origin/gh/xmfan/244/base 2025-09-07T07:51:36.4950368Z * [new branch] gh/xmfan/244/head -> origin/gh/xmfan/244/head 2025-09-07T07:51:36.4951921Z * [new branch] gh/xmfan/244/orig -> origin/gh/xmfan/244/orig 2025-09-07T07:51:36.4954131Z * [new branch] gh/xmfan/246/base -> origin/gh/xmfan/246/base 2025-09-07T07:51:36.4955936Z * [new branch] gh/xmfan/246/head -> origin/gh/xmfan/246/head 2025-09-07T07:51:36.4957488Z * [new branch] gh/xmfan/246/orig -> origin/gh/xmfan/246/orig 2025-09-07T07:51:36.4959644Z * [new branch] gh/xmfan/253/base -> origin/gh/xmfan/253/base 2025-09-07T07:51:36.4961163Z * [new branch] gh/xmfan/253/head -> origin/gh/xmfan/253/head 2025-09-07T07:51:36.4962706Z * [new branch] gh/xmfan/253/orig -> origin/gh/xmfan/253/orig 
2025-09-07T07:51:36.4964825Z * [new branch] gh/xmfan/254/base -> origin/gh/xmfan/254/base 2025-09-07T07:51:36.4966732Z * [new branch] gh/xmfan/254/head -> origin/gh/xmfan/254/head 2025-09-07T07:51:36.4968221Z * [new branch] gh/xmfan/254/orig -> origin/gh/xmfan/254/orig 2025-09-07T07:51:36.4970571Z * [new branch] gh/xmfan/260/base -> origin/gh/xmfan/260/base 2025-09-07T07:51:36.4972017Z * [new branch] gh/xmfan/260/head -> origin/gh/xmfan/260/head 2025-09-07T07:51:36.4973471Z * [new branch] gh/xmfan/260/orig -> origin/gh/xmfan/260/orig 2025-09-07T07:51:36.4975874Z * [new branch] gh/xmfan/262/base -> origin/gh/xmfan/262/base 2025-09-07T07:51:36.4977360Z * [new branch] gh/xmfan/262/head -> origin/gh/xmfan/262/head 2025-09-07T07:51:36.4978814Z * [new branch] gh/xmfan/262/orig -> origin/gh/xmfan/262/orig 2025-09-07T07:51:36.4981218Z * [new branch] gh/xmfan/263/base -> origin/gh/xmfan/263/base 2025-09-07T07:51:36.4982844Z * [new branch] gh/xmfan/263/head -> origin/gh/xmfan/263/head 2025-09-07T07:51:36.4984355Z * [new branch] gh/xmfan/263/orig -> origin/gh/xmfan/263/orig 2025-09-07T07:51:36.4986801Z * [new branch] gh/xmfan/264/base -> origin/gh/xmfan/264/base 2025-09-07T07:51:36.4988357Z * [new branch] gh/xmfan/264/head -> origin/gh/xmfan/264/head 2025-09-07T07:51:36.4989836Z * [new branch] gh/xmfan/264/orig -> origin/gh/xmfan/264/orig 2025-09-07T07:51:36.4992079Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-09-07T07:51:36.4993685Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-09-07T07:51:36.4995337Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-09-07T07:51:36.4997677Z * [new branch] gh/xmfan/276/base -> origin/gh/xmfan/276/base 2025-09-07T07:51:36.4999330Z * [new branch] gh/xmfan/276/head -> origin/gh/xmfan/276/head 2025-09-07T07:51:36.5001036Z * [new branch] gh/xmfan/276/orig -> origin/gh/xmfan/276/orig 2025-09-07T07:51:36.5003204Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 
2025-09-07T07:51:36.5004732Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-09-07T07:51:36.5006568Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-09-07T07:51:36.5008784Z * [new branch] gh/xmfan/278/base -> origin/gh/xmfan/278/base 2025-09-07T07:51:36.5010292Z * [new branch] gh/xmfan/278/head -> origin/gh/xmfan/278/head 2025-09-07T07:51:36.5011780Z * [new branch] gh/xmfan/278/orig -> origin/gh/xmfan/278/orig 2025-09-07T07:51:36.5013930Z * [new branch] gh/xmfan/279/base -> origin/gh/xmfan/279/base 2025-09-07T07:51:36.5015718Z * [new branch] gh/xmfan/279/head -> origin/gh/xmfan/279/head 2025-09-07T07:51:36.5017222Z * [new branch] gh/xmfan/279/orig -> origin/gh/xmfan/279/orig 2025-09-07T07:51:36.5019656Z * [new branch] gh/xmfan/280/base -> origin/gh/xmfan/280/base 2025-09-07T07:51:36.5021315Z * [new branch] gh/xmfan/280/head -> origin/gh/xmfan/280/head 2025-09-07T07:51:36.5022991Z * [new branch] gh/xmfan/280/orig -> origin/gh/xmfan/280/orig 2025-09-07T07:51:36.5025332Z * [new branch] gh/xmfan/281/base -> origin/gh/xmfan/281/base 2025-09-07T07:51:36.5027055Z * [new branch] gh/xmfan/281/head -> origin/gh/xmfan/281/head 2025-09-07T07:51:36.5028600Z * [new branch] gh/xmfan/281/orig -> origin/gh/xmfan/281/orig 2025-09-07T07:51:36.5030767Z * [new branch] gh/xmfan/282/base -> origin/gh/xmfan/282/base 2025-09-07T07:51:36.5032411Z * [new branch] gh/xmfan/282/head -> origin/gh/xmfan/282/head 2025-09-07T07:51:36.5034546Z * [new branch] gh/xmfan/283/base -> origin/gh/xmfan/283/base 2025-09-07T07:51:36.5036819Z * [new branch] gh/xmfan/283/head -> origin/gh/xmfan/283/head 2025-09-07T07:51:36.5038195Z * [new branch] gh/xmfan/283/orig -> origin/gh/xmfan/283/orig 2025-09-07T07:51:36.5040951Z * [new branch] gh/xuanzhang816/14/base -> origin/gh/xuanzhang816/14/base 2025-09-07T07:51:36.5045746Z * [new branch] gh/xuanzhang816/14/head -> origin/gh/xuanzhang816/14/head 2025-09-07T07:51:36.5047329Z * [new branch] gh/xuanzhang816/14/orig -> 
origin/gh/xuanzhang816/14/orig 2025-09-07T07:51:36.5049545Z * [new branch] gh/xuanzhang816/19/base -> origin/gh/xuanzhang816/19/base 2025-09-07T07:51:36.5051024Z * [new branch] gh/xuanzhang816/19/head -> origin/gh/xuanzhang816/19/head 2025-09-07T07:51:36.5052598Z * [new branch] gh/xuanzhang816/19/orig -> origin/gh/xuanzhang816/19/orig 2025-09-07T07:51:36.5054859Z * [new branch] gh/xuanzhang816/22/base -> origin/gh/xuanzhang816/22/base 2025-09-07T07:51:36.5056654Z * [new branch] gh/xuanzhang816/22/head -> origin/gh/xuanzhang816/22/head 2025-09-07T07:51:36.5058224Z * [new branch] gh/xuanzhang816/22/orig -> origin/gh/xuanzhang816/22/orig 2025-09-07T07:51:36.5060350Z * [new branch] gh/xuanzhang816/23/base -> origin/gh/xuanzhang816/23/base 2025-09-07T07:51:36.5061985Z * [new branch] gh/xuanzhang816/23/head -> origin/gh/xuanzhang816/23/head 2025-09-07T07:51:36.5063572Z * [new branch] gh/xuanzhang816/23/orig -> origin/gh/xuanzhang816/23/orig 2025-09-07T07:51:36.5065928Z * [new branch] gh/xuanzhang816/24/base -> origin/gh/xuanzhang816/24/base 2025-09-07T07:51:36.5067449Z * [new branch] gh/xuanzhang816/24/head -> origin/gh/xuanzhang816/24/head 2025-09-07T07:51:36.5069137Z * [new branch] gh/xuanzhang816/24/orig -> origin/gh/xuanzhang816/24/orig 2025-09-07T07:51:36.5071495Z * [new branch] gh/xuanzhang816/25/base -> origin/gh/xuanzhang816/25/base 2025-09-07T07:51:36.5073009Z * [new branch] gh/xuanzhang816/25/head -> origin/gh/xuanzhang816/25/head 2025-09-07T07:51:36.5074489Z * [new branch] gh/xuanzhang816/25/orig -> origin/gh/xuanzhang816/25/orig 2025-09-07T07:51:36.5077113Z * [new branch] gh/xuanzhang816/26/base -> origin/gh/xuanzhang816/26/base 2025-09-07T07:51:36.5078904Z * [new branch] gh/xuanzhang816/26/head -> origin/gh/xuanzhang816/26/head 2025-09-07T07:51:36.5080404Z * [new branch] gh/xuanzhang816/26/orig -> origin/gh/xuanzhang816/26/orig 2025-09-07T07:51:36.5083218Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-09-07T07:51:36.5084811Z * [new 
branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-09-07T07:51:36.5086753Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 2025-09-07T07:51:36.5088867Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-09-07T07:51:36.5090413Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-09-07T07:51:36.5091948Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-09-07T07:51:36.5094212Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-09-07T07:51:36.5096102Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-09-07T07:51:36.5097806Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-09-07T07:51:36.5099958Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-09-07T07:51:36.5101609Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-09-07T07:51:36.5103466Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-09-07T07:51:36.5105499Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-09-07T07:51:36.5107146Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-09-07T07:51:36.5108785Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-09-07T07:51:36.5111162Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-09-07T07:51:36.5112728Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-09-07T07:51:36.5114319Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-09-07T07:51:36.5116771Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-09-07T07:51:36.5118400Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-09-07T07:51:36.5120029Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-09-07T07:51:36.5122276Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 
2025-09-07T07:51:36.5123840Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-09-07T07:51:36.5125398Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-09-07T07:51:36.5127821Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-09-07T07:51:36.5129381Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-09-07T07:51:36.5131616Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-09-07T07:51:36.5133118Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-09-07T07:51:36.5134668Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-09-07T07:51:36.5137338Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-09-07T07:51:36.5138777Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-09-07T07:51:36.5140268Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-09-07T07:51:36.5142614Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-09-07T07:51:36.5144078Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-09-07T07:51:36.5145924Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-09-07T07:51:36.5148183Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-09-07T07:51:36.5149850Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-09-07T07:51:36.5151435Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-09-07T07:51:36.5153679Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-09-07T07:51:36.5155278Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 2025-09-07T07:51:36.5157029Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-09-07T07:51:36.5159585Z * [new branch] gh/yanbing-j/36/base -> origin/gh/yanbing-j/36/base 2025-09-07T07:51:36.5161137Z * [new branch] gh/yanbing-j/36/head -> 
origin/gh/yanbing-j/36/head 2025-09-07T07:51:36.5162607Z * [new branch] gh/yanbing-j/36/orig -> origin/gh/yanbing-j/36/orig 2025-09-07T07:51:36.5164755Z * [new branch] gh/yanbing-j/37/base -> origin/gh/yanbing-j/37/base 2025-09-07T07:51:36.5166698Z * [new branch] gh/yanbing-j/37/head -> origin/gh/yanbing-j/37/head 2025-09-07T07:51:36.5168328Z * [new branch] gh/yanbing-j/37/orig -> origin/gh/yanbing-j/37/orig 2025-09-07T07:51:36.5170982Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-09-07T07:51:36.5172512Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-09-07T07:51:36.5174081Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-09-07T07:51:36.5176507Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-09-07T07:51:36.5178083Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-09-07T07:51:36.5179623Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-09-07T07:51:36.5181941Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-09-07T07:51:36.5183624Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-09-07T07:51:36.5185269Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-09-07T07:51:36.5187827Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-09-07T07:51:36.5189474Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-09-07T07:51:36.5191153Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-09-07T07:51:36.5193553Z * [new branch] gh/yangw-dev/16/base -> origin/gh/yangw-dev/16/base 2025-09-07T07:51:36.5195263Z * [new branch] gh/yangw-dev/16/head -> origin/gh/yangw-dev/16/head 2025-09-07T07:51:36.5197007Z * [new branch] gh/yangw-dev/16/orig -> origin/gh/yangw-dev/16/orig 2025-09-07T07:51:36.5199167Z * [new branch] gh/yangw-dev/17/base -> origin/gh/yangw-dev/17/base 2025-09-07T07:51:36.5200797Z * [new branch] 
gh/yangw-dev/17/head -> origin/gh/yangw-dev/17/head 2025-09-07T07:51:36.5202333Z * [new branch] gh/yangw-dev/17/orig -> origin/gh/yangw-dev/17/orig 2025-09-07T07:51:36.5204608Z * [new branch] gh/yangw-dev/18/base -> origin/gh/yangw-dev/18/base 2025-09-07T07:51:36.5206666Z * [new branch] gh/yangw-dev/18/head -> origin/gh/yangw-dev/18/head 2025-09-07T07:51:36.5208203Z * [new branch] gh/yangw-dev/18/orig -> origin/gh/yangw-dev/18/orig 2025-09-07T07:51:36.5210434Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-09-07T07:51:36.5212025Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-09-07T07:51:36.5213615Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-09-07T07:51:36.5216396Z * [new branch] gh/yangw-dev/20/base -> origin/gh/yangw-dev/20/base 2025-09-07T07:51:36.5217885Z * [new branch] gh/yangw-dev/20/head -> origin/gh/yangw-dev/20/head 2025-09-07T07:51:36.5219397Z * [new branch] gh/yangw-dev/20/orig -> origin/gh/yangw-dev/20/orig 2025-09-07T07:51:36.5221662Z * [new branch] gh/yangw-dev/21/base -> origin/gh/yangw-dev/21/base 2025-09-07T07:51:36.5223367Z * [new branch] gh/yangw-dev/21/head -> origin/gh/yangw-dev/21/head 2025-09-07T07:51:36.5225191Z * [new branch] gh/yangw-dev/21/orig -> origin/gh/yangw-dev/21/orig 2025-09-07T07:51:36.5227558Z * [new branch] gh/yangw-dev/22/base -> origin/gh/yangw-dev/22/base 2025-09-07T07:51:36.5229156Z * [new branch] gh/yangw-dev/22/head -> origin/gh/yangw-dev/22/head 2025-09-07T07:51:36.5230696Z * [new branch] gh/yangw-dev/22/orig -> origin/gh/yangw-dev/22/orig 2025-09-07T07:51:36.5232829Z * [new branch] gh/yangw-dev/23/base -> origin/gh/yangw-dev/23/base 2025-09-07T07:51:36.5234618Z * [new branch] gh/yangw-dev/23/head -> origin/gh/yangw-dev/23/head 2025-09-07T07:51:36.5236335Z * [new branch] gh/yangw-dev/23/orig -> origin/gh/yangw-dev/23/orig 2025-09-07T07:51:36.5238637Z * [new branch] gh/yangw-dev/24/base -> origin/gh/yangw-dev/24/base 
2025-09-07T07:51:36.5240179Z * [new branch] gh/yangw-dev/24/head -> origin/gh/yangw-dev/24/head 2025-09-07T07:51:36.5241698Z * [new branch] gh/yangw-dev/24/orig -> origin/gh/yangw-dev/24/orig 2025-09-07T07:51:36.5243937Z * [new branch] gh/yangw-dev/25/base -> origin/gh/yangw-dev/25/base 2025-09-07T07:51:36.5245777Z * [new branch] gh/yangw-dev/25/head -> origin/gh/yangw-dev/25/head 2025-09-07T07:51:36.5247307Z * [new branch] gh/yangw-dev/25/orig -> origin/gh/yangw-dev/25/orig 2025-09-07T07:51:36.5249538Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-09-07T07:51:36.5251108Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-09-07T07:51:36.5252657Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-09-07T07:51:36.5254905Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-09-07T07:51:36.5256789Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-09-07T07:51:36.5258443Z * [new branch] gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-09-07T07:51:36.5261553Z * [new branch] gh/ydwu4/233/base -> origin/gh/ydwu4/233/base 2025-09-07T07:51:36.5263240Z * [new branch] gh/ydwu4/233/head -> origin/gh/ydwu4/233/head 2025-09-07T07:51:36.5264818Z * [new branch] gh/ydwu4/233/orig -> origin/gh/ydwu4/233/orig 2025-09-07T07:51:36.5267506Z * [new branch] gh/ydwu4/246/base -> origin/gh/ydwu4/246/base 2025-09-07T07:51:36.5269171Z * [new branch] gh/ydwu4/246/head -> origin/gh/ydwu4/246/head 2025-09-07T07:51:36.5271017Z * [new branch] gh/ydwu4/246/orig -> origin/gh/ydwu4/246/orig 2025-09-07T07:51:36.5273304Z * [new branch] gh/ydwu4/253/base -> origin/gh/ydwu4/253/base 2025-09-07T07:51:36.5275059Z * [new branch] gh/ydwu4/253/head -> origin/gh/ydwu4/253/head 2025-09-07T07:51:36.5276769Z * [new branch] gh/ydwu4/253/orig -> origin/gh/ydwu4/253/orig 2025-09-07T07:51:36.5279369Z * [new branch] gh/ydwu4/255/base -> origin/gh/ydwu4/255/base 2025-09-07T07:51:36.5280889Z * [new branch] 
gh/ydwu4/255/head -> origin/gh/ydwu4/255/head 2025-09-07T07:51:36.5282542Z * [new branch] gh/ydwu4/255/orig -> origin/gh/ydwu4/255/orig 2025-09-07T07:51:36.5284817Z * [new branch] gh/ydwu4/259/base -> origin/gh/ydwu4/259/base 2025-09-07T07:51:36.5286900Z * [new branch] gh/ydwu4/259/head -> origin/gh/ydwu4/259/head 2025-09-07T07:51:36.5288416Z * [new branch] gh/ydwu4/259/orig -> origin/gh/ydwu4/259/orig 2025-09-07T07:51:36.5290710Z * [new branch] gh/ydwu4/262/base -> origin/gh/ydwu4/262/base 2025-09-07T07:51:36.5292433Z * [new branch] gh/ydwu4/262/head -> origin/gh/ydwu4/262/head 2025-09-07T07:51:36.5294020Z * [new branch] gh/ydwu4/262/orig -> origin/gh/ydwu4/262/orig 2025-09-07T07:51:36.5296593Z * [new branch] gh/ydwu4/263/base -> origin/gh/ydwu4/263/base 2025-09-07T07:51:36.5298143Z * [new branch] gh/ydwu4/263/head -> origin/gh/ydwu4/263/head 2025-09-07T07:51:36.5299692Z * [new branch] gh/ydwu4/263/orig -> origin/gh/ydwu4/263/orig 2025-09-07T07:51:36.5302222Z * [new branch] gh/ydwu4/269/base -> origin/gh/ydwu4/269/base 2025-09-07T07:51:36.5303952Z * [new branch] gh/ydwu4/269/head -> origin/gh/ydwu4/269/head 2025-09-07T07:51:36.5305623Z * [new branch] gh/ydwu4/269/orig -> origin/gh/ydwu4/269/orig 2025-09-07T07:51:36.5307916Z * [new branch] gh/ydwu4/270/base -> origin/gh/ydwu4/270/base 2025-09-07T07:51:36.5309583Z * [new branch] gh/ydwu4/270/head -> origin/gh/ydwu4/270/head 2025-09-07T07:51:36.5311113Z * [new branch] gh/ydwu4/270/orig -> origin/gh/ydwu4/270/orig 2025-09-07T07:51:36.5313472Z * [new branch] gh/ydwu4/272/base -> origin/gh/ydwu4/272/base 2025-09-07T07:51:36.5315323Z * [new branch] gh/ydwu4/272/head -> origin/gh/ydwu4/272/head 2025-09-07T07:51:36.5316982Z * [new branch] gh/ydwu4/272/orig -> origin/gh/ydwu4/272/orig 2025-09-07T07:51:36.5319071Z * [new branch] gh/ydwu4/275/base -> origin/gh/ydwu4/275/base 2025-09-07T07:51:36.5320722Z * [new branch] gh/ydwu4/275/head -> origin/gh/ydwu4/275/head 2025-09-07T07:51:36.5322256Z * [new branch] gh/ydwu4/275/orig 
-> origin/gh/ydwu4/275/orig 2025-09-07T07:51:36.5324403Z * [new branch] gh/ydwu4/276/base -> origin/gh/ydwu4/276/base 2025-09-07T07:51:36.5326466Z * [new branch] gh/ydwu4/276/head -> origin/gh/ydwu4/276/head 2025-09-07T07:51:36.5328094Z * [new branch] gh/ydwu4/276/orig -> origin/gh/ydwu4/276/orig 2025-09-07T07:51:36.5330577Z * [new branch] gh/ydwu4/279/base -> origin/gh/ydwu4/279/base 2025-09-07T07:51:36.5332327Z * [new branch] gh/ydwu4/279/head -> origin/gh/ydwu4/279/head 2025-09-07T07:51:36.5333876Z * [new branch] gh/ydwu4/279/orig -> origin/gh/ydwu4/279/orig 2025-09-07T07:51:36.5336685Z * [new branch] gh/ydwu4/283/base -> origin/gh/ydwu4/283/base 2025-09-07T07:51:36.5338277Z * [new branch] gh/ydwu4/283/head -> origin/gh/ydwu4/283/head 2025-09-07T07:51:36.5339931Z * [new branch] gh/ydwu4/283/orig -> origin/gh/ydwu4/283/orig 2025-09-07T07:51:36.5342247Z * [new branch] gh/ydwu4/289/base -> origin/gh/ydwu4/289/base 2025-09-07T07:51:36.5343812Z * [new branch] gh/ydwu4/289/head -> origin/gh/ydwu4/289/head 2025-09-07T07:51:36.5345484Z * [new branch] gh/ydwu4/289/orig -> origin/gh/ydwu4/289/orig 2025-09-07T07:51:36.5347956Z * [new branch] gh/ydwu4/290/base -> origin/gh/ydwu4/290/base 2025-09-07T07:51:36.5349854Z * [new branch] gh/ydwu4/290/head -> origin/gh/ydwu4/290/head 2025-09-07T07:51:36.5351986Z * [new branch] gh/ydwu4/290/orig -> origin/gh/ydwu4/290/orig 2025-09-07T07:51:36.5354322Z * [new branch] gh/ydwu4/291/base -> origin/gh/ydwu4/291/base 2025-09-07T07:51:36.5356302Z * [new branch] gh/ydwu4/291/head -> origin/gh/ydwu4/291/head 2025-09-07T07:51:36.5357892Z * [new branch] gh/ydwu4/291/orig -> origin/gh/ydwu4/291/orig 2025-09-07T07:51:36.5360294Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-09-07T07:51:36.5361794Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-09-07T07:51:36.5363357Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-09-07T07:51:36.5365850Z * [new branch] gh/ydwu4/293/base -> 
origin/gh/ydwu4/293/base 2025-09-07T07:51:36.5367466Z * [new branch] gh/ydwu4/293/head -> origin/gh/ydwu4/293/head 2025-09-07T07:51:36.5369348Z * [new branch] gh/ydwu4/293/orig -> origin/gh/ydwu4/293/orig 2025-09-07T07:51:36.5371587Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-09-07T07:51:36.5373379Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-09-07T07:51:36.5374795Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 2025-09-07T07:51:36.5377318Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-09-07T07:51:36.5378912Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-09-07T07:51:36.5380492Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-09-07T07:51:36.5382925Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-09-07T07:51:36.5384482Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-09-07T07:51:36.5386341Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-09-07T07:51:36.5389405Z * [new branch] gh/ydwu4/300/base -> origin/gh/ydwu4/300/base 2025-09-07T07:51:36.5391509Z * [new branch] gh/ydwu4/300/head -> origin/gh/ydwu4/300/head 2025-09-07T07:51:36.5393250Z * [new branch] gh/ydwu4/300/orig -> origin/gh/ydwu4/300/orig 2025-09-07T07:51:36.5396060Z * [new branch] gh/ydwu4/301/base -> origin/gh/ydwu4/301/base 2025-09-07T07:51:36.5397581Z * [new branch] gh/ydwu4/301/head -> origin/gh/ydwu4/301/head 2025-09-07T07:51:36.5399273Z * [new branch] gh/ydwu4/301/orig -> origin/gh/ydwu4/301/orig 2025-09-07T07:51:36.5401578Z * [new branch] gh/ydwu4/302/base -> origin/gh/ydwu4/302/base 2025-09-07T07:51:36.5403167Z * [new branch] gh/ydwu4/302/head -> origin/gh/ydwu4/302/head 2025-09-07T07:51:36.5404754Z * [new branch] gh/ydwu4/302/orig -> origin/gh/ydwu4/302/orig 2025-09-07T07:51:36.5407160Z * [new branch] gh/ydwu4/303/base -> origin/gh/ydwu4/303/base 2025-09-07T07:51:36.5408727Z * [new branch] gh/ydwu4/303/head -> 
origin/gh/ydwu4/303/head 2025-09-07T07:51:36.5410317Z * [new branch] gh/ydwu4/303/orig -> origin/gh/ydwu4/303/orig 2025-09-07T07:51:36.5412478Z * [new branch] gh/ydwu4/304/base -> origin/gh/ydwu4/304/base 2025-09-07T07:51:36.5414065Z * [new branch] gh/ydwu4/304/head -> origin/gh/ydwu4/304/head 2025-09-07T07:51:36.5415893Z * [new branch] gh/ydwu4/304/orig -> origin/gh/ydwu4/304/orig 2025-09-07T07:51:36.6310399Z * [new branch] gh/ydwu4/305/base -> origin/gh/ydwu4/305/base 2025-09-07T07:51:36.6312101Z * [new branch] gh/ydwu4/305/head -> origin/gh/ydwu4/305/head 2025-09-07T07:51:36.6313697Z * [new branch] gh/ydwu4/305/orig -> origin/gh/ydwu4/305/orig 2025-09-07T07:51:36.6316505Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-09-07T07:51:36.6318123Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-09-07T07:51:36.6319738Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-09-07T07:51:36.6321982Z * [new branch] gh/ydwu4/307/base -> origin/gh/ydwu4/307/base 2025-09-07T07:51:36.6323451Z * [new branch] gh/ydwu4/307/head -> origin/gh/ydwu4/307/head 2025-09-07T07:51:36.6324927Z * [new branch] gh/ydwu4/307/orig -> origin/gh/ydwu4/307/orig 2025-09-07T07:51:36.6327867Z * [new branch] gh/ydwu4/308/base -> origin/gh/ydwu4/308/base 2025-09-07T07:51:36.6329444Z * [new branch] gh/ydwu4/308/head -> origin/gh/ydwu4/308/head 2025-09-07T07:51:36.6331045Z * [new branch] gh/ydwu4/308/orig -> origin/gh/ydwu4/308/orig 2025-09-07T07:51:36.6333230Z * [new branch] gh/ydwu4/309/base -> origin/gh/ydwu4/309/base 2025-09-07T07:51:36.6335179Z * [new branch] gh/ydwu4/309/head -> origin/gh/ydwu4/309/head 2025-09-07T07:51:36.6336718Z * [new branch] gh/ydwu4/309/orig -> origin/gh/ydwu4/309/orig 2025-09-07T07:51:36.6339191Z * [new branch] gh/ydwu4/310/base -> origin/gh/ydwu4/310/base 2025-09-07T07:51:36.6341124Z * [new branch] gh/ydwu4/310/head -> origin/gh/ydwu4/310/head 2025-09-07T07:51:36.6342793Z * [new branch] gh/ydwu4/310/orig -> 
origin/gh/ydwu4/310/orig 2025-09-07T07:51:36.6345228Z * [new branch] gh/ydwu4/311/base -> origin/gh/ydwu4/311/base 2025-09-07T07:51:36.6346967Z * [new branch] gh/ydwu4/311/head -> origin/gh/ydwu4/311/head 2025-09-07T07:51:36.6348508Z * [new branch] gh/ydwu4/311/orig -> origin/gh/ydwu4/311/orig 2025-09-07T07:51:36.6350830Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 2025-09-07T07:51:36.6352400Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-09-07T07:51:36.6354005Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-09-07T07:51:36.6356759Z * [new branch] gh/ydwu4/313/base -> origin/gh/ydwu4/313/base 2025-09-07T07:51:36.6358665Z * [new branch] gh/ydwu4/313/head -> origin/gh/ydwu4/313/head 2025-09-07T07:51:36.6360361Z * [new branch] gh/ydwu4/313/orig -> origin/gh/ydwu4/313/orig 2025-09-07T07:51:36.6362667Z * [new branch] gh/ydwu4/314/base -> origin/gh/ydwu4/314/base 2025-09-07T07:51:36.6364495Z * [new branch] gh/ydwu4/314/head -> origin/gh/ydwu4/314/head 2025-09-07T07:51:36.6366305Z * [new branch] gh/ydwu4/314/orig -> origin/gh/ydwu4/314/orig 2025-09-07T07:51:36.6368757Z * [new branch] gh/ydwu4/315/base -> origin/gh/ydwu4/315/base 2025-09-07T07:51:36.6370354Z * [new branch] gh/ydwu4/315/head -> origin/gh/ydwu4/315/head 2025-09-07T07:51:36.6372078Z * [new branch] gh/ydwu4/315/orig -> origin/gh/ydwu4/315/orig 2025-09-07T07:51:36.6374424Z * [new branch] gh/ydwu4/316/base -> origin/gh/ydwu4/316/base 2025-09-07T07:51:36.6376316Z * [new branch] gh/ydwu4/316/head -> origin/gh/ydwu4/316/head 2025-09-07T07:51:36.6378088Z * [new branch] gh/ydwu4/316/orig -> origin/gh/ydwu4/316/orig 2025-09-07T07:51:36.6380571Z * [new branch] gh/ydwu4/317/base -> origin/gh/ydwu4/317/base 2025-09-07T07:51:36.6382209Z * [new branch] gh/ydwu4/317/head -> origin/gh/ydwu4/317/head 2025-09-07T07:51:36.6383831Z * [new branch] gh/ydwu4/317/orig -> origin/gh/ydwu4/317/orig 2025-09-07T07:51:36.6386407Z * [new branch] gh/ydwu4/318/base -> 
origin/gh/ydwu4/318/base 2025-09-07T07:51:36.6388036Z * [new branch] gh/ydwu4/318/head -> origin/gh/ydwu4/318/head 2025-09-07T07:51:36.6389718Z * [new branch] gh/ydwu4/318/orig -> origin/gh/ydwu4/318/orig 2025-09-07T07:51:36.6391990Z * [new branch] gh/ydwu4/319/base -> origin/gh/ydwu4/319/base 2025-09-07T07:51:36.6393530Z * [new branch] gh/ydwu4/319/head -> origin/gh/ydwu4/319/head 2025-09-07T07:51:36.6395245Z * [new branch] gh/ydwu4/319/orig -> origin/gh/ydwu4/319/orig 2025-09-07T07:51:36.6397780Z * [new branch] gh/ydwu4/320/base -> origin/gh/ydwu4/320/base 2025-09-07T07:51:36.6399245Z * [new branch] gh/ydwu4/320/head -> origin/gh/ydwu4/320/head 2025-09-07T07:51:36.6400771Z * [new branch] gh/ydwu4/320/orig -> origin/gh/ydwu4/320/orig 2025-09-07T07:51:36.6402882Z * [new branch] gh/ydwu4/321/base -> origin/gh/ydwu4/321/base 2025-09-07T07:51:36.6404676Z * [new branch] gh/ydwu4/321/head -> origin/gh/ydwu4/321/head 2025-09-07T07:51:36.6406500Z * [new branch] gh/ydwu4/321/orig -> origin/gh/ydwu4/321/orig 2025-09-07T07:51:36.6408815Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-09-07T07:51:36.6410390Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-09-07T07:51:36.6411991Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-09-07T07:51:36.6414277Z * [new branch] gh/ydwu4/323/base -> origin/gh/ydwu4/323/base 2025-09-07T07:51:36.6416101Z * [new branch] gh/ydwu4/323/head -> origin/gh/ydwu4/323/head 2025-09-07T07:51:36.6417653Z * [new branch] gh/ydwu4/323/orig -> origin/gh/ydwu4/323/orig 2025-09-07T07:51:36.6419959Z * [new branch] gh/ydwu4/324/base -> origin/gh/ydwu4/324/base 2025-09-07T07:51:36.6421664Z * [new branch] gh/ydwu4/324/head -> origin/gh/ydwu4/324/head 2025-09-07T07:51:36.6423174Z * [new branch] gh/ydwu4/324/orig -> origin/gh/ydwu4/324/orig 2025-09-07T07:51:36.6426424Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-09-07T07:51:36.6428086Z * [new branch] gh/yf225/133/head -> 
origin/gh/yf225/133/head 2025-09-07T07:51:36.6430581Z * [new branch] gh/yf225/171/base -> origin/gh/yf225/171/base 2025-09-07T07:51:36.6432260Z * [new branch] gh/yf225/171/head -> origin/gh/yf225/171/head 2025-09-07T07:51:36.6433815Z * [new branch] gh/yf225/171/orig -> origin/gh/yf225/171/orig 2025-09-07T07:51:36.6436497Z * [new branch] gh/yf225/172/base -> origin/gh/yf225/172/base 2025-09-07T07:51:36.6438091Z * [new branch] gh/yf225/172/head -> origin/gh/yf225/172/head 2025-09-07T07:51:36.6439844Z * [new branch] gh/yf225/172/orig -> origin/gh/yf225/172/orig 2025-09-07T07:51:36.6442048Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-09-07T07:51:36.6443605Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-09-07T07:51:36.6447268Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-09-07T07:51:36.6449005Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-09-07T07:51:36.6450652Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-09-07T07:51:36.6452940Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-09-07T07:51:36.6454556Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-09-07T07:51:36.6456419Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-09-07T07:51:36.6459297Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-09-07T07:51:36.6460938Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-09-07T07:51:36.6463225Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-09-07T07:51:36.6464701Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-09-07T07:51:36.6467734Z * [new branch] gh/ysiraichi/79/base -> origin/gh/ysiraichi/79/base 2025-09-07T07:51:36.6469346Z * [new branch] gh/ysiraichi/79/head -> origin/gh/ysiraichi/79/head 2025-09-07T07:51:36.6471307Z * [new branch] gh/ysiraichi/79/orig -> origin/gh/ysiraichi/79/orig 
2025-09-07T07:51:36.6473572Z * [new branch] gh/ysiraichi/88/base -> origin/gh/ysiraichi/88/base 2025-09-07T07:51:36.6475476Z * [new branch] gh/ysiraichi/88/head -> origin/gh/ysiraichi/88/head 2025-09-07T07:51:36.6476936Z * [new branch] gh/ysiraichi/88/orig -> origin/gh/ysiraichi/88/orig 2025-09-07T07:51:36.6480093Z * [new branch] gh/zhxchen17/25/base -> origin/gh/zhxchen17/25/base 2025-09-07T07:51:36.6481894Z * [new branch] gh/zhxchen17/25/head -> origin/gh/zhxchen17/25/head 2025-09-07T07:51:36.6483369Z * [new branch] gh/zhxchen17/25/orig -> origin/gh/zhxchen17/25/orig 2025-09-07T07:51:36.6486054Z * [new branch] gh/zhxchen17/31/base -> origin/gh/zhxchen17/31/base 2025-09-07T07:51:36.6487743Z * [new branch] gh/zhxchen17/31/head -> origin/gh/zhxchen17/31/head 2025-09-07T07:51:36.6489291Z * [new branch] gh/zhxchen17/31/orig -> origin/gh/zhxchen17/31/orig 2025-09-07T07:51:36.6491488Z * [new branch] gh/zhxchen17/34/base -> origin/gh/zhxchen17/34/base 2025-09-07T07:51:36.6493111Z * [new branch] gh/zhxchen17/34/head -> origin/gh/zhxchen17/34/head 2025-09-07T07:51:36.6495305Z * [new branch] gh/zhxchen17/35/base -> origin/gh/zhxchen17/35/base 2025-09-07T07:51:36.6497113Z * [new branch] gh/zhxchen17/35/head -> origin/gh/zhxchen17/35/head 2025-09-07T07:51:36.6499624Z * [new branch] gh/zhxchen17/37/base -> origin/gh/zhxchen17/37/base 2025-09-07T07:51:36.6501219Z * [new branch] gh/zhxchen17/37/head -> origin/gh/zhxchen17/37/head 2025-09-07T07:51:36.6503007Z * [new branch] gh/zhxchen17/37/orig -> origin/gh/zhxchen17/37/orig 2025-09-07T07:51:36.6505506Z * [new branch] gh/zhxchen17/38/base -> origin/gh/zhxchen17/38/base 2025-09-07T07:51:36.6507133Z * [new branch] gh/zhxchen17/38/head -> origin/gh/zhxchen17/38/head 2025-09-07T07:51:36.6508842Z * [new branch] gh/zhxchen17/38/orig -> origin/gh/zhxchen17/38/orig 2025-09-07T07:51:36.6511203Z * [new branch] gh/zhxchen17/39/base -> origin/gh/zhxchen17/39/base 2025-09-07T07:51:36.6512764Z * [new branch] gh/zhxchen17/39/head -> 
origin/gh/zhxchen17/39/head 2025-09-07T07:51:36.6514331Z * [new branch] gh/zhxchen17/39/orig -> origin/gh/zhxchen17/39/orig 2025-09-07T07:51:36.6517034Z * [new branch] gh/zhxchen17/40/base -> origin/gh/zhxchen17/40/base 2025-09-07T07:51:36.6518774Z * [new branch] gh/zhxchen17/40/head -> origin/gh/zhxchen17/40/head 2025-09-07T07:51:36.6520505Z * [new branch] gh/zhxchen17/40/orig -> origin/gh/zhxchen17/40/orig 2025-09-07T07:51:36.6522701Z * [new branch] gh/zhxchen17/41/base -> origin/gh/zhxchen17/41/base 2025-09-07T07:51:36.6524473Z * [new branch] gh/zhxchen17/41/head -> origin/gh/zhxchen17/41/head 2025-09-07T07:51:36.6526515Z * [new branch] gh/zhxchen17/41/orig -> origin/gh/zhxchen17/41/orig 2025-09-07T07:51:36.6528882Z * [new branch] gh/zhxchen17/42/base -> origin/gh/zhxchen17/42/base 2025-09-07T07:51:36.6530665Z * [new branch] gh/zhxchen17/42/head -> origin/gh/zhxchen17/42/head 2025-09-07T07:51:36.6532373Z * [new branch] gh/zhxchen17/42/orig -> origin/gh/zhxchen17/42/orig 2025-09-07T07:51:36.6534826Z * [new branch] gh/zhxchen17/43/base -> origin/gh/zhxchen17/43/base 2025-09-07T07:51:36.6536785Z * [new branch] gh/zhxchen17/43/head -> origin/gh/zhxchen17/43/head 2025-09-07T07:51:36.6538297Z * [new branch] gh/zhxchen17/43/orig -> origin/gh/zhxchen17/43/orig 2025-09-07T07:51:36.6540927Z * [new branch] gh/zhxchen17/44/base -> origin/gh/zhxchen17/44/base 2025-09-07T07:51:36.6542458Z * [new branch] gh/zhxchen17/44/head -> origin/gh/zhxchen17/44/head 2025-09-07T07:51:36.6544170Z * [new branch] gh/zhxchen17/44/orig -> origin/gh/zhxchen17/44/orig 2025-09-07T07:51:36.6546663Z * [new branch] gh/zhxchen17/45/base -> origin/gh/zhxchen17/45/base 2025-09-07T07:51:36.6548357Z * [new branch] gh/zhxchen17/45/head -> origin/gh/zhxchen17/45/head 2025-09-07T07:51:36.6565743Z * [new branch] gh/zhxchen17/45/orig -> origin/gh/zhxchen17/45/orig 2025-09-07T07:51:36.6566199Z * [new branch] gh/zklaus/10/base -> origin/gh/zklaus/10/base 2025-09-07T07:51:36.6566588Z * [new branch] 
gh/zklaus/10/head -> origin/gh/zklaus/10/head 2025-09-07T07:51:36.6566977Z * [new branch] gh/zklaus/10/orig -> origin/gh/zklaus/10/orig 2025-09-07T07:51:36.6567370Z * [new branch] gh/zklaus/11/base -> origin/gh/zklaus/11/base 2025-09-07T07:51:36.6567750Z * [new branch] gh/zklaus/11/head -> origin/gh/zklaus/11/head 2025-09-07T07:51:36.6568219Z * [new branch] gh/zklaus/11/orig -> origin/gh/zklaus/11/orig 2025-09-07T07:51:36.6568683Z * [new branch] gh/zklaus/12/base -> origin/gh/zklaus/12/base 2025-09-07T07:51:36.6569144Z * [new branch] gh/zklaus/12/head -> origin/gh/zklaus/12/head 2025-09-07T07:51:36.6569588Z * [new branch] gh/zklaus/12/orig -> origin/gh/zklaus/12/orig 2025-09-07T07:51:36.6570363Z * [new branch] gh/zklaus/14/base -> origin/gh/zklaus/14/base 2025-09-07T07:51:36.6571864Z * [new branch] gh/zklaus/14/head -> origin/gh/zklaus/14/head 2025-09-07T07:51:36.6573417Z * [new branch] gh/zklaus/14/orig -> origin/gh/zklaus/14/orig 2025-09-07T07:51:36.6575900Z * [new branch] gh/zklaus/15/base -> origin/gh/zklaus/15/base 2025-09-07T07:51:36.6577632Z * [new branch] gh/zklaus/15/head -> origin/gh/zklaus/15/head 2025-09-07T07:51:36.6579424Z * [new branch] gh/zklaus/15/orig -> origin/gh/zklaus/15/orig 2025-09-07T07:51:36.6581880Z * [new branch] gh/zklaus/16/base -> origin/gh/zklaus/16/base 2025-09-07T07:51:36.6583413Z * [new branch] gh/zklaus/16/head -> origin/gh/zklaus/16/head 2025-09-07T07:51:36.6585119Z * [new branch] gh/zklaus/16/orig -> origin/gh/zklaus/16/orig 2025-09-07T07:51:36.6587494Z * [new branch] gh/zklaus/17/base -> origin/gh/zklaus/17/base 2025-09-07T07:51:36.6589037Z * [new branch] gh/zklaus/17/head -> origin/gh/zklaus/17/head 2025-09-07T07:51:36.6590609Z * [new branch] gh/zklaus/17/orig -> origin/gh/zklaus/17/orig 2025-09-07T07:51:36.6592725Z * [new branch] gh/zklaus/18/base -> origin/gh/zklaus/18/base 2025-09-07T07:51:36.6594309Z * [new branch] gh/zklaus/18/head -> origin/gh/zklaus/18/head 2025-09-07T07:51:36.6596184Z * [new branch] gh/zklaus/18/orig 
-> origin/gh/zklaus/18/orig 2025-09-07T07:51:36.6598551Z * [new branch] gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-09-07T07:51:36.6600232Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-09-07T07:51:36.6601833Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-09-07T07:51:36.6604048Z * [new branch] gh/zklaus/20/base -> origin/gh/zklaus/20/base 2025-09-07T07:51:36.6605807Z * [new branch] gh/zklaus/20/head -> origin/gh/zklaus/20/head 2025-09-07T07:51:36.6607414Z * [new branch] gh/zklaus/20/orig -> origin/gh/zklaus/20/orig 2025-09-07T07:51:36.6609679Z * [new branch] gh/zklaus/7/base -> origin/gh/zklaus/7/base 2025-09-07T07:51:36.6611188Z * [new branch] gh/zklaus/7/head -> origin/gh/zklaus/7/head 2025-09-07T07:51:36.6612914Z * [new branch] gh/zklaus/7/orig -> origin/gh/zklaus/7/orig 2025-09-07T07:51:36.6614907Z * [new branch] gh/zklaus/9/base -> origin/gh/zklaus/9/base 2025-09-07T07:51:36.6616753Z * [new branch] gh/zklaus/9/head -> origin/gh/zklaus/9/head 2025-09-07T07:51:36.6618316Z * [new branch] gh/zklaus/9/orig -> origin/gh/zklaus/9/orig 2025-09-07T07:51:36.6621293Z * [new branch] gh/zou3519/1175/base -> origin/gh/zou3519/1175/base 2025-09-07T07:51:36.6622988Z * [new branch] gh/zou3519/1175/head -> origin/gh/zou3519/1175/head 2025-09-07T07:51:36.6624528Z * [new branch] gh/zou3519/1175/orig -> origin/gh/zou3519/1175/orig 2025-09-07T07:51:36.6627051Z * [new branch] gh/zou3519/1177/base -> origin/gh/zou3519/1177/base 2025-09-07T07:51:36.6628719Z * [new branch] gh/zou3519/1177/head -> origin/gh/zou3519/1177/head 2025-09-07T07:51:36.6630573Z * [new branch] gh/zou3519/1177/orig -> origin/gh/zou3519/1177/orig 2025-09-07T07:51:36.6632831Z * [new branch] gh/zou3519/1191/base -> origin/gh/zou3519/1191/base 2025-09-07T07:51:36.6634570Z * [new branch] gh/zou3519/1191/head -> origin/gh/zou3519/1191/head 2025-09-07T07:51:36.6636678Z * [new branch] gh/zou3519/1191/orig -> origin/gh/zou3519/1191/orig 2025-09-07T07:51:36.6639279Z * [new 
branch] gh/zou3519/1192/base -> origin/gh/zou3519/1192/base 2025-09-07T07:51:36.6640863Z * [new branch] gh/zou3519/1192/head -> origin/gh/zou3519/1192/head 2025-09-07T07:51:36.6642450Z * [new branch] gh/zou3519/1192/orig -> origin/gh/zou3519/1192/orig 2025-09-07T07:51:36.6644576Z * [new branch] gh/zou3519/1193/base -> origin/gh/zou3519/1193/base 2025-09-07T07:51:36.6646504Z * [new branch] gh/zou3519/1193/head -> origin/gh/zou3519/1193/head 2025-09-07T07:51:36.6647979Z * [new branch] gh/zou3519/1193/orig -> origin/gh/zou3519/1193/orig 2025-09-07T07:51:36.6650001Z * [new branch] gh/zou3519/1194/base -> origin/gh/zou3519/1194/base 2025-09-07T07:51:36.6651609Z * [new branch] gh/zou3519/1194/head -> origin/gh/zou3519/1194/head 2025-09-07T07:51:36.6653255Z * [new branch] gh/zou3519/1194/orig -> origin/gh/zou3519/1194/orig 2025-09-07T07:51:36.6655851Z * [new branch] gh/zou3519/1195/base -> origin/gh/zou3519/1195/base 2025-09-07T07:51:36.6657364Z * [new branch] gh/zou3519/1195/head -> origin/gh/zou3519/1195/head 2025-09-07T07:51:36.6659211Z * [new branch] gh/zou3519/1195/orig -> origin/gh/zou3519/1195/orig 2025-09-07T07:51:36.6661323Z * [new branch] gh/zou3519/1196/base -> origin/gh/zou3519/1196/base 2025-09-07T07:51:36.6663027Z * [new branch] gh/zou3519/1196/head -> origin/gh/zou3519/1196/head 2025-09-07T07:51:36.6664751Z * [new branch] gh/zou3519/1196/orig -> origin/gh/zou3519/1196/orig 2025-09-07T07:51:36.6667091Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-09-07T07:51:36.6668678Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-09-07T07:51:36.6670201Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 2025-09-07T07:51:36.6673160Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-09-07T07:51:36.6674645Z * [new branch] gh/zpcore/1/head -> origin/gh/zpcore/1/head 2025-09-07T07:51:36.6677237Z * [new branch] gh/zpcore/10/base -> origin/gh/zpcore/10/base 2025-09-07T07:51:36.6678581Z * [new 
branch] gh/zpcore/10/head -> origin/gh/zpcore/10/head 2025-09-07T07:51:36.6680262Z * [new branch] gh/zpcore/10/orig -> origin/gh/zpcore/10/orig 2025-09-07T07:51:36.6682276Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-09-07T07:51:36.6683981Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-09-07T07:51:36.6685754Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-09-07T07:51:36.6688120Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-09-07T07:51:36.6689936Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-09-07T07:51:36.6691525Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-09-07T07:51:36.6693723Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-09-07T07:51:36.6695577Z * [new branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-09-07T07:51:36.6697290Z * [new branch] gh/zpcore/13/orig -> origin/gh/zpcore/13/orig 2025-09-07T07:51:36.6699439Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-09-07T07:51:36.6700975Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-09-07T07:51:36.6703314Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-09-07T07:51:36.6704873Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-09-07T07:51:36.6707155Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-09-07T07:51:36.6708735Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-09-07T07:51:36.6710780Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-09-07T07:51:36.6712342Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-09-07T07:51:36.6714428Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-09-07T07:51:36.6716221Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-09-07T07:51:36.6718205Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-09-07T07:51:36.6719721Z * [new branch] gh/zpcore/6/head -> 
origin/gh/zpcore/6/head 2025-09-07T07:51:36.6721778Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-09-07T07:51:36.6723250Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-09-07T07:51:36.6725598Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-09-07T07:51:36.6727201Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 2025-09-07T07:51:36.6729166Z * [new branch] google-main -> origin/google-main 2025-09-07T07:51:36.6731533Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-09-07T07:51:36.6732987Z * [new branch] guangyey/host_alloc -> origin/guangyey/host_alloc 2025-09-07T07:51:36.6734373Z * [new branch] guangyey/reimport -> origin/guangyey/reimport 2025-09-07T07:51:36.6736134Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-09-07T07:51:36.6738636Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-09-07T07:51:36.6740855Z * [new branch] haozhe/bf16-dynamic-shape -> origin/haozhe/bf16-dynamic-shape 2025-09-07T07:51:36.6742794Z * [new branch] hc_baseline -> origin/hc_baseline 2025-09-07T07:51:36.6744691Z * [new branch] hf_update -> origin/hf_update 2025-09-07T07:51:36.6746968Z * [new branch] hhh_decomp_mul -> origin/hhh_decomp_mul 2025-09-07T07:51:36.6748932Z * [new branch] hhh_rand -> origin/hhh_rand 2025-09-07T07:51:36.6751222Z * [new branch] hoy/mmsplitk -> origin/hoy/mmsplitk 2025-09-07T07:51:36.6752668Z * [new branch] hoy/triton-PR3973 -> origin/hoy/triton-PR3973 2025-09-07T07:51:36.6754271Z * [new branch] hoy/triton-coalescing-baseline -> origin/hoy/triton-coalescing-baseline 2025-09-07T07:51:36.6756001Z * [new branch] hoy/triton-coalescing-new -> origin/hoy/triton-coalescing-new 2025-09-07T07:51:36.6757487Z * [new branch] hoy/triton-coalescing-vec -> origin/hoy/triton-coalescing-vec 2025-09-07T07:51:36.6759730Z * [new branch] inductordecompfix -> origin/inductordecompfix 2025-09-07T07:51:36.6761577Z 
* [new branch] inline -> origin/inline 2025-09-07T07:51:36.6763399Z * [new branch] inlining -> origin/inlining 2025-09-07T07:51:36.6765415Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-09-07T07:51:36.6767430Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-09-07T07:51:36.6769166Z * [new branch] int8_sdpa -> origin/int8_sdpa 2025-09-07T07:51:36.6771099Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-09-07T07:51:36.6772886Z * [new branch] issue#58739 -> origin/issue#58739 2025-09-07T07:51:36.6775693Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-09-07T07:51:36.6777228Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-09-07T07:51:36.6779538Z * [new branch] jeanschmidt/disable_rocm_build_tests -> origin/jeanschmidt/disable_rocm_build_tests 2025-09-07T07:51:36.6781383Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-09-07T07:51:36.6783385Z * [new branch] jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-09-07T07:51:36.6786203Z * [new branch] justinchu/attention-tests -> origin/justinchu/attention-tests 2025-09-07T07:51:36.6787672Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-09-07T07:51:36.6789583Z * [new branch] justinchu/ort-122 -> origin/justinchu/ort-122 2025-09-07T07:51:36.6792158Z * [new branch] justinchuby/dynamo-true -> origin/justinchuby/dynamo-true 2025-09-07T07:51:36.6794422Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-09-07T07:51:36.6796556Z * [new branch] kainan_test -> origin/kainan_test 2025-09-07T07:51:36.6798522Z * [new branch] learnablebias -> origin/learnablebias 2025-09-07T07:51:36.6800981Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-09-07T07:51:36.6803293Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 
2025-09-07T07:51:36.6805800Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-09-07T07:51:36.6807470Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-09-07T07:51:36.6809148Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-09-07T07:51:36.6811103Z * [new branch] lintbuilddocker -> origin/lintbuilddocker 2025-09-07T07:51:36.6812833Z * [new branch] llama4-stable -> origin/llama4-stable 2025-09-07T07:51:36.6814706Z * [new branch] logdetfix -> origin/logdetfix 2025-09-07T07:51:36.6817939Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-09-07T07:51:36.6820744Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-09-07T07:51:36.6822344Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-09-07T07:51:36.6823821Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-09-07T07:51:36.6825508Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-09-07T07:51:36.6827228Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-09-07T07:51:36.6828830Z * [new branch] lucaskabela/issue_120648 -> origin/lucaskabela/issue_120648 2025-09-07T07:51:36.6830579Z * [new branch] lucaskabela/misc_typing_dynamo -> origin/lucaskabela/misc_typing_dynamo 2025-09-07T07:51:36.6832053Z * [new branch] lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-09-07T07:51:36.6833623Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-09-07T07:51:36.6835135Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-09-07T07:51:36.6836746Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-09-07T07:51:36.6838507Z * [new branch] lucaskabela/typing_symbolic_convert -> 
origin/lucaskabela/typing_symbolic_convert 2025-09-07T07:51:36.6840936Z * [new branch] lucaskabela/typing_utils_improvements -> origin/lucaskabela/typing_utils_improvements 2025-09-07T07:51:36.6842678Z * [new branch] main -> origin/main 2025-09-07T07:51:36.6844670Z * [new branch] main-enable-b200-distributed-tests -> origin/main-enable-b200-distributed-tests 2025-09-07T07:51:36.6846744Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-09-07T07:51:36.6848653Z * [new branch] malfet-patch-12 -> origin/malfet-patch-12 2025-09-07T07:51:36.6850485Z * [new branch] malfet-patch-14 -> origin/malfet-patch-14 2025-09-07T07:51:36.6852481Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-09-07T07:51:36.6854327Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-09-07T07:51:36.6857184Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-09-07T07:51:36.6858941Z * [new branch] malfet/delete-upsteam-cuda -> origin/malfet/delete-upsteam-cuda 2025-09-07T07:51:36.6860581Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-09-07T07:51:36.6863065Z * [new branch] manuel/test-ops-common-allow-mps -> origin/manuel/test-ops-common-allow-mps 2025-09-07T07:51:36.6864867Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-09-07T07:51:36.6867538Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-09-07T07:51:36.6868977Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-09-07T07:51:36.6870494Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-09-07T07:51:36.6872075Z * [new branch] mlazos/backup-test-branch -> origin/mlazos/backup-test-branch 2025-09-07T07:51:36.6873685Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-09-07T07:51:36.6875334Z * [new branch] mlazos/baseline -> origin/mlazos/baseline 2025-09-07T07:51:36.6877194Z * [new branch] mlazos/baseline-graph-breaks -> 
origin/mlazos/baseline-graph-breaks 2025-09-07T07:51:36.6878633Z * [new branch] mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-09-07T07:51:36.6880303Z * [new branch] mlazos/better-msg -> origin/mlazos/better-msg 2025-09-07T07:51:36.6881635Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-09-07T07:51:36.6883022Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-09-07T07:51:36.6884597Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-09-07T07:51:36.6886793Z * [new branch] mlazos/ck2 -> origin/mlazos/ck2 2025-09-07T07:51:36.6888369Z * [new branch] mlazos/combokernels -> origin/mlazos/combokernels 2025-09-07T07:51:36.6889899Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-09-07T07:51:36.6891375Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-09-07T07:51:36.6892926Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-09-07T07:51:36.6894532Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-09-07T07:51:36.6896309Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-09-07T07:51:36.6897921Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-09-07T07:51:36.6899423Z * [new branch] mlazos/data-gather -> origin/mlazos/data-gather 2025-09-07T07:51:36.6900948Z * [new branch] mlazos/data-ptrs2 -> origin/mlazos/data-ptrs2 2025-09-07T07:51:36.6902798Z * [new branch] mlazos/data-ptrs3 -> origin/mlazos/data-ptrs3 2025-09-07T07:51:36.6904369Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-09-07T07:51:36.6906179Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-09-07T07:51:36.6907722Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-09-07T07:51:36.6909165Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-09-07T07:51:36.6910854Z * [new branch] mlazos/disable-closures -> origin/mlazos/disable-closures 
2025-09-07T07:51:36.6912378Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-09-07T07:51:36.6913819Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-09-07T07:51:36.6915604Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-09-07T07:51:36.6917402Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-09-07T07:51:36.6919083Z * [new branch] mlazos/exp_disable -> origin/mlazos/exp_disable 2025-09-07T07:51:36.6920670Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-09-07T07:51:36.6922158Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-09-07T07:51:36.6923781Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-09-07T07:51:36.6925388Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-09-07T07:51:36.6927183Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-09-07T07:51:36.6928720Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-09-07T07:51:36.6930241Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-09-07T07:51:36.6931804Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-09-07T07:51:36.6933437Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-09-07T07:51:36.6935148Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-09-07T07:51:36.6936951Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-09-07T07:51:36.6938865Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-09-07T07:51:36.6940320Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-09-07T07:51:36.6942220Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-09-07T07:51:36.6943900Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-09-07T07:51:36.6945772Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-09-07T07:51:36.6947364Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-09-07T07:51:36.6949019Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 
2025-09-07T07:51:36.6950700Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-09-07T07:51:36.6952318Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-09-07T07:51:36.6953948Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 2025-09-07T07:51:36.6955814Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-09-07T07:51:36.6957336Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-09-07T07:51:36.6959294Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-09-07T07:51:36.6960900Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-09-07T07:51:36.6962564Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-09-07T07:51:36.6964187Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-09-07T07:51:36.6966127Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-09-07T07:51:36.6967709Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-09-07T07:51:36.6969366Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-09-07T07:51:36.6971048Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-09-07T07:51:36.6972689Z * [new branch] mlazos/init-per-param -> origin/mlazos/init-per-param 2025-09-07T07:51:36.6974328Z * [new branch] mlazos/init_per_param -> origin/mlazos/init_per_param 2025-09-07T07:51:36.6976253Z * [new branch] mlazos/less-guards -> origin/mlazos/less-guards 2025-09-07T07:51:36.6978042Z * [new branch] mlazos/lr-composibility -> origin/mlazos/lr-composibility 2025-09-07T07:51:36.6979762Z * [new branch] mlazos/main -> origin/mlazos/main 2025-09-07T07:51:36.6981511Z * [new branch] mlazos/main-test-enablement -> origin/mlazos/main-test-enablement 2025-09-07T07:51:36.6983280Z * [new branch] mlazos/main2 -> origin/mlazos/main2 2025-09-07T07:51:36.6985192Z * [new branch] mlazos/mark-static-update -> origin/mlazos/mark-static-update 2025-09-07T07:51:36.6986891Z * [new branch] mlazos/mcg -> origin/mlazos/mcg 2025-09-07T07:51:36.6988624Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-09-07T07:51:36.6990285Z * [new branch] mlazos/meta-guards -> 
origin/mlazos/meta-guards 2025-09-07T07:51:36.6992447Z * [new branch] mlazos/mlazos/ck2 -> origin/mlazos/mlazos/ck2 2025-09-07T07:51:36.6994089Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-09-07T07:51:36.6995984Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-09-07T07:51:36.6997885Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-09-07T07:51:36.6999810Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-09-07T07:51:36.7001708Z * [new branch] mlazos/more-tests -> origin/mlazos/more-tests 2025-09-07T07:51:36.7003191Z * [new branch] mlazos/no-cpp -> origin/mlazos/no-cpp 2025-09-07T07:51:36.7005178Z * [new branch] mlazos/no-init-group-handling -> origin/mlazos/no-init-group-handling 2025-09-07T07:51:36.7006917Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-09-07T07:51:36.7008680Z * [new branch] mlazos/opt-bench-exp2 -> origin/mlazos/opt-bench-exp2 2025-09-07T07:51:36.7010383Z * [new branch] mlazos/opt-incr -> origin/mlazos/opt-incr 2025-09-07T07:51:36.7012071Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-09-07T07:51:36.7013739Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-09-07T07:51:36.7015740Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-09-07T07:51:36.7017552Z * [new branch] mlazos/revert-inline -> origin/mlazos/revert-inline 2025-09-07T07:51:36.7019456Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-09-07T07:51:36.7021197Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-09-07T07:51:36.7023090Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-09-07T07:51:36.7024887Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-09-07T07:51:36.7027059Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-09-07T07:51:36.7028805Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 
2025-09-07T07:51:36.7030718Z * [new branch] mlazos/sub-param-fix -> origin/mlazos/sub-param-fix 2025-09-07T07:51:36.7032477Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-09-07T07:51:36.7034289Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-09-07T07:51:36.7036296Z * [new branch] mlazos/test -> origin/mlazos/test 2025-09-07T07:51:36.7038078Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-09-07T07:51:36.7040189Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-09-07T07:51:36.7041939Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-09-07T07:51:36.7043726Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-09-07T07:51:36.7045723Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-09-07T07:51:36.7047518Z * [new branch] mlazos/topo-fix -> origin/mlazos/topo-fix 2025-09-07T07:51:36.7049262Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 2025-09-07T07:51:36.7050982Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-09-07T07:51:36.7052720Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-09-07T07:51:36.7054366Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-09-07T07:51:36.7056498Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-09-07T07:51:36.7058313Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-09-07T07:51:36.7060090Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-09-07T07:51:36.7061935Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-09-07T07:51:36.7063792Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-09-07T07:51:36.7065903Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-09-07T07:51:36.7067964Z * [new branch] modify-setupvllm -> origin/modify-setupvllm 2025-09-07T07:51:36.7069933Z * [new branch] module-shim 
-> origin/module-shim 2025-09-07T07:51:36.7071867Z * [new branch] move-theme-out-docker -> origin/move-theme-out-docker 2025-09-07T07:51:36.7074188Z * [new branch] msaroufim/be1 -> origin/msaroufim/be1 2025-09-07T07:51:36.7076233Z * [new branch] msaroufim/cn_path -> origin/msaroufim/cn_path 2025-09-07T07:51:36.7077915Z * [new branch] msaroufim/dtensorfusedadam -> origin/msaroufim/dtensorfusedadam 2025-09-07T07:51:36.7079695Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-09-07T07:51:36.7082099Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-09-07T07:51:36.7083950Z * [new branch] muon_dev -> origin/muon_dev 2025-09-07T07:51:36.7086072Z * [new branch] muon_dev_1 -> origin/muon_dev_1 2025-09-07T07:51:36.7088054Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-09-07T07:51:36.7090320Z * [new branch] nativert_numoutputs -> origin/nativert_numoutputs 2025-09-07T07:51:36.7092130Z * [new branch] new-modifiy-setupvllm -> origin/new-modifiy-setupvllm 2025-09-07T07:51:36.7093864Z * [new branch] new-setupvllm -> origin/new-setupvllm 2025-09-07T07:51:36.7095939Z * [new branch] new_zeros_dtype -> origin/new_zeros_dtype 2025-09-07T07:51:36.7097878Z * [new branch] newtest-base -> origin/newtest-base 2025-09-07T07:51:36.7100238Z * [new branch] ngimel/cat_perf1 -> origin/ngimel/cat_perf1 2025-09-07T07:51:36.7101860Z * [new branch] ngimel/einsum_fix -> origin/ngimel/einsum_fix 2025-09-07T07:51:36.7103360Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-09-07T07:51:36.7104891Z * [new branch] ngimel/fabric_check -> origin/ngimel/fabric_check 2025-09-07T07:51:36.7106794Z * [new branch] ngimel/fabric_fix -> origin/ngimel/fabric_fix 2025-09-07T07:51:36.7108298Z * [new branch] ngimel/fix_driver_init_error -> origin/ngimel/fix_driver_init_error 2025-09-07T07:51:36.7109786Z * [new branch] ngimel/fix_nccl_segment_seg -> origin/ngimel/fix_nccl_segment_seg 2025-09-07T07:51:36.7111208Z * [new branch] 
ngimel/gg_new -> origin/ngimel/gg_new 2025-09-07T07:51:36.7112736Z * [new branch] ngimel/modeguard -> origin/ngimel/modeguard 2025-09-07T07:51:36.7114237Z * [new branch] ngimel/multicast_fix -> origin/ngimel/multicast_fix 2025-09-07T07:51:36.7116033Z * [new branch] ngimel/rocm_handle_type -> origin/ngimel/rocm_handle_type 2025-09-07T07:51:36.7117757Z * [new branch] ngimel/symm_handle_fabric -> origin/ngimel/symm_handle_fabric 2025-09-07T07:51:36.7119483Z * [new branch] ngimel/unbind_multimem -> origin/ngimel/unbind_multimem 2025-09-07T07:51:36.7121301Z * [new branch] nightly -> origin/nightly 2025-09-07T07:51:36.7123461Z * [new branch] nmacchioni-patch-10 -> origin/nmacchioni-patch-10 2025-09-07T07:51:36.7125499Z * [new branch] nmacchioni-patch-7 -> origin/nmacchioni-patch-7 2025-09-07T07:51:36.7127642Z * [new branch] nmacchioni-patch-8 -> origin/nmacchioni-patch-8 2025-09-07T07:51:36.7129592Z * [new branch] nmacchioni-patch-9 -> origin/nmacchioni-patch-9 2025-09-07T07:51:36.7131904Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-09-07T07:51:36.7133867Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-09-07T07:51:36.7135849Z * [new branch] one-off -> origin/one-off 2025-09-07T07:51:36.7138924Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-09-07T07:51:36.7140518Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-09-07T07:51:36.7142366Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-09-07T07:51:36.7144103Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-09-07T07:51:36.7146030Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-09-07T07:51:36.7147814Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-09-07T07:51:36.7149743Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-09-07T07:51:36.7151508Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-09-07T07:51:36.7153144Z * [new branch] 
orig/release/2.0 -> origin/orig/release/2.0 2025-09-07T07:51:36.7154772Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-09-07T07:51:36.7156748Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-09-07T07:51:36.7158393Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-09-07T07:51:36.7160168Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-09-07T07:51:36.7161751Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-09-07T07:51:36.7163269Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-09-07T07:51:36.7164842Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-09-07T07:51:36.7166727Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-09-07T07:51:36.7169094Z * [new branch] oulgen/fx_graph -> origin/oulgen/fx_graph 2025-09-07T07:51:36.7171037Z * [new branch] padded-tensor -> origin/padded-tensor 2025-09-07T07:51:36.7172958Z * [new branch] pca2 -> origin/pca2 2025-09-07T07:51:36.7175112Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-09-07T07:51:36.7177807Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-09-07T07:51:36.7179543Z * [new branch] pianpwk/invalidate_fake_memo -> origin/pianpwk/invalidate_fake_memo 2025-09-07T07:51:36.7181012Z * [new branch] pianpwk/max_1_strides -> origin/pianpwk/max_1_strides 2025-09-07T07:51:36.7182655Z * [new branch] pianpwk/maybe_guard_rel -> origin/pianpwk/maybe_guard_rel 2025-09-07T07:51:36.7184142Z * [new branch] pianpwk/nonzero_memo -> origin/pianpwk/nonzero_memo 2025-09-07T07:51:36.7186115Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-09-07T07:51:36.7187596Z * [new branch] pianpwk/oblivious_slice_forward -> origin/pianpwk/oblivious_slice_forward 2025-09-07T07:51:36.7189040Z * [new branch] pianpwk/oblivious_where -> origin/pianpwk/oblivious_where 2025-09-07T07:51:36.7190595Z * [new branch] 
pianpwk/param_static_pgo -> origin/pianpwk/param_static_pgo 2025-09-07T07:51:36.7192077Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-09-07T07:51:36.7193785Z * [new branch] pianpwk/remove_guard_fail_break -> origin/pianpwk/remove_guard_fail_break 2025-09-07T07:51:36.7195364Z * [new branch] pianpwk/slice_fresh_symbols -> origin/pianpwk/slice_fresh_symbols 2025-09-07T07:51:36.7197118Z * [new branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-09-07T07:51:36.7198554Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-09-07T07:51:36.7200040Z * [new branch] pianpwk/test_slice_fake_impl -> origin/pianpwk/test_slice_fake_impl 2025-09-07T07:51:36.7201605Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-09-07T07:51:36.7203083Z * [new branch] pianpwk/unbacked_channels_last -> origin/pianpwk/unbacked_channels_last 2025-09-07T07:51:36.7204671Z * [new branch] pianpwk/unbacked_safe_conv1d -> origin/pianpwk/unbacked_safe_conv1d 2025-09-07T07:51:36.7206420Z * [new branch] pianpwk/unbacked_sdpa_flash -> origin/pianpwk/unbacked_sdpa_flash 2025-09-07T07:51:36.7209360Z * [new branch] pianpwk/unbacked_should_swap -> origin/pianpwk/unbacked_should_swap 2025-09-07T07:51:36.7210373Z * [new branch] pianpwk/unbacked_should_swap_2 -> origin/pianpwk/unbacked_should_swap_2 2025-09-07T07:51:36.7211428Z * [new branch] pianpwk/unbacked_slice_binding -> origin/pianpwk/unbacked_slice_binding 2025-09-07T07:51:36.7212962Z * [new branch] pianpwk/unbacked_slice_forward -> origin/pianpwk/unbacked_slice_forward 2025-09-07T07:51:36.7214513Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-09-07T07:51:36.7216386Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-09-07T07:51:36.7217916Z * [new branch] pianpwk/whitelist_optimizer -> origin/pianpwk/whitelist_optimizer 2025-09-07T07:51:36.7220016Z * [new branch] 
pin-torchao -> origin/pin-torchao 2025-09-07T07:51:36.7222668Z * [new branch] piz/fall_back_missing_0716 -> origin/piz/fall_back_missing_0716 2025-09-07T07:51:36.7224108Z * [new branch] piz/improve_scatter_0808 -> origin/piz/improve_scatter_0808 2025-09-07T07:51:36.7226271Z * [new branch] pool-separate -> origin/pool-separate 2025-09-07T07:51:36.7228218Z * [new branch] pr-156087 -> origin/pr-156087 2025-09-07T07:51:36.7230862Z * [new branch] pr/131860 -> origin/pr/131860 2025-09-07T07:51:36.7232807Z * [new branch] predispatch_to -> origin/predispatch_to 2025-09-07T07:51:36.7234670Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-09-07T07:51:36.7236965Z * [new branch] pyobjectslot -> origin/pyobjectslot 2025-09-07T07:51:36.7239693Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-09-07T07:51:36.7242326Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-09-07T07:51:36.7244200Z * [new branch] quint-bits -> origin/quint-bits 2025-09-07T07:51:36.7247065Z * [new branch] release/1.10 -> origin/release/1.10 2025-09-07T07:51:36.7248843Z * [new branch] release/1.11 -> origin/release/1.11 2025-09-07T07:51:36.7250586Z * [new branch] release/1.12 -> origin/release/1.12 2025-09-07T07:51:36.7252190Z * [new branch] release/1.13 -> origin/release/1.13 2025-09-07T07:51:36.7253873Z * [new branch] release/1.4 -> origin/release/1.4 2025-09-07T07:51:36.7255556Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-09-07T07:51:36.7257150Z * [new branch] release/1.5 -> origin/release/1.5 2025-09-07T07:51:36.7258965Z * [new branch] release/1.6 -> origin/release/1.6 2025-09-07T07:51:36.7260754Z * [new branch] release/1.7 -> origin/release/1.7 2025-09-07T07:51:36.7262700Z * [new branch] release/1.8 -> origin/release/1.8 2025-09-07T07:51:36.7264086Z * [new branch] release/1.9 -> origin/release/1.9 2025-09-07T07:51:36.7266049Z * [new branch] release/2.0 -> origin/release/2.0 2025-09-07T07:51:36.7267700Z * [new branch] 
release/2.1 -> origin/release/2.1 2025-09-07T07:51:36.7269567Z * [new branch] release/2.2 -> origin/release/2.2 2025-09-07T07:51:36.7271319Z * [new branch] release/2.3 -> origin/release/2.3 2025-09-07T07:51:36.7272996Z * [new branch] release/2.4 -> origin/release/2.4 2025-09-07T07:51:36.7274590Z * [new branch] release/2.5 -> origin/release/2.5 2025-09-07T07:51:36.7276562Z * [new branch] release/2.6 -> origin/release/2.6 2025-09-07T07:51:36.7278238Z * [new branch] release/2.7 -> origin/release/2.7 2025-09-07T07:51:36.7279739Z * [new branch] release/2.8 -> origin/release/2.8 2025-09-07T07:51:36.7281536Z * [new branch] release_notes -> origin/release_notes 2025-09-07T07:51:36.7283413Z * [new branch] remove-actionable-label -> origin/remove-actionable-label 2025-09-07T07:51:36.7285495Z * [new branch] remove-ao -> origin/remove-ao 2025-09-07T07:51:36.7287640Z * [new branch] removedeprecatedvllmtest -> origin/removedeprecatedvllmtest 2025-09-07T07:51:36.7289446Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-09-07T07:51:36.7291262Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-09-07T07:51:36.7292946Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-09-07T07:51:36.7295589Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-09-07T07:51:36.7297365Z * [new branch] replace-pytorch-labs-20250812-204125 -> origin/replace-pytorch-labs-20250812-204125 2025-09-07T07:51:36.7299168Z * [new branch] replace-pytorch-labs-20250812-205624 -> origin/replace-pytorch-labs-20250812-205624 2025-09-07T07:51:36.7302712Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-09-07T07:51:36.7306727Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-09-07T07:51:36.7310197Z 
* [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-09-07T07:51:36.7312270Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-09-07T07:51:36.7314356Z * [new branch] rocm-monitoring -> origin/rocm-monitoring 2025-09-07T07:51:36.7317047Z * [new branch] ruisi/relax_memory -> origin/ruisi/relax_memory 2025-09-07T07:51:36.7318972Z * [new branch] run-torchbench-smoke-test-h100 -> origin/run-torchbench-smoke-test-h100 2025-09-07T07:51:36.7321439Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-09-07T07:51:36.7322931Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-09-07T07:51:36.7325816Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-09-07T07:51:36.7327122Z * [new branch] rzou/njt -> origin/rzou/njt 2025-09-07T07:51:36.7328766Z * [new branch] rzou/pca -> origin/rzou/pca 2025-09-07T07:51:36.7330262Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-09-07T07:51:36.7331940Z * [new branch] rzou/setup_context -> origin/rzou/setup_context 2025-09-07T07:51:36.7334273Z * [new branch] sanchitintel/refactor_aten_int8_woq_gemm -> origin/sanchitintel/refactor_aten_int8_woq_gemm 2025-09-07T07:51:36.7336193Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-09-07T07:51:36.7337740Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-09-07T07:51:36.7339513Z * [new branch] save -> origin/save 2025-09-07T07:51:36.7342082Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-09-07T07:51:36.7343949Z * [new branch] seemethere-patch-1 -> origin/seemethere-patch-1 2025-09-07T07:51:36.7346081Z * [new branch] setupvllm -> origin/setupvllm 2025-09-07T07:51:36.7347781Z * [new branch] share_and_pin_fork -> 
origin/share_and_pin_fork 2025-09-07T07:51:36.7350053Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-09-07T07:51:36.7351819Z * [new branch] shikaili_fp8_allgather -> origin/shikaili_fp8_allgather 2025-09-07T07:51:36.7353551Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 2025-09-07T07:51:36.7355523Z * [new branch] shoumikhin-patch-12 -> origin/shoumikhin-patch-12 2025-09-07T07:51:36.7357405Z * [new branch] simplify-fq-per-channel -> origin/simplify-fq-per-channel 2025-09-07T07:51:36.7359059Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-09-07T07:51:36.7361210Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-09-07T07:51:36.7363622Z * [new branch] sqzhang/flight4 -> origin/sqzhang/flight4 2025-09-07T07:51:36.7365330Z * [new branch] sqzhang/flight4plus -> origin/sqzhang/flight4plus 2025-09-07T07:51:36.7367797Z * [new branch] sraikund/record_funct_test -> origin/sraikund/record_funct_test 2025-09-07T07:51:36.7370258Z * [new branch] sraikund16/test -> origin/sraikund16/test 2025-09-07T07:51:36.7372151Z * [new branch] stablize-compilation-time -> origin/stablize-compilation-time 2025-09-07T07:51:36.7373869Z * [new branch] standalone-templates -> origin/standalone-templates 2025-09-07T07:51:36.7375898Z * [new branch] standalone_package_weights -> origin/standalone_package_weights 2025-09-07T07:51:36.7377641Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-09-07T07:51:36.7379537Z * [new branch] subgraph_fuse -> origin/subgraph_fuse 2025-09-07T07:51:36.7381668Z * [new branch] support-uv-in-collect_env -> origin/support-uv-in-collect_env 2025-09-07T07:51:36.7383348Z * [new branch] sve-poc -> origin/sve-poc 2025-09-07T07:51:36.7385448Z * [new branch] svekars-patch-1 -> origin/svekars-patch-1 2025-09-07T07:51:36.7387459Z * [new branch] switch-bn -> origin/switch-bn 2025-09-07T07:51:36.7389280Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 
2025-09-07T07:51:36.7391589Z * [new branch] tenpercent/ck_rocm_ci_v3 -> origin/tenpercent/ck_rocm_ci_v3 2025-09-07T07:51:36.7393403Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-09-07T07:51:36.7395284Z * [new branch] test-7054 -> origin/test-7054 2025-09-07T07:51:36.7397219Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-09-07T07:51:36.7399270Z * [new branch] test-myst-markdown-docstring -> origin/test-myst-markdown-docstring 2025-09-07T07:51:36.7400759Z * [new branch] test-old -> origin/test-old 2025-09-07T07:51:36.7402503Z * [new branch] test-vec-migration-internally -> origin/test-vec-migration-internally 2025-09-07T07:51:36.7404623Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-09-07T07:51:36.7406478Z * [new branch] test/inductor -> origin/test/inductor 2025-09-07T07:51:36.7408864Z * [new branch] tianren/flex_paged_attn_fix -> origin/tianren/flex_paged_attn_fix 2025-09-07T07:51:36.7410371Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-09-07T07:51:36.7411975Z * [new branch] tianren/test -> origin/tianren/test 2025-09-07T07:51:36.7413743Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-09-07T07:51:36.7415770Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-09-07T07:51:36.7417604Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-09-07T07:51:36.7419257Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-09-07T07:51:36.7420920Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-09-07T07:51:36.7422900Z * [new branch] tree_vec_base -> origin/tree_vec_base 2025-09-07T07:51:36.7424586Z * [new branch] triton-update -> origin/triton-update 2025-09-07T07:51:36.7426514Z * [new branch] triton_kernel -> origin/triton_kernel 2025-09-07T07:51:36.7428164Z * [new branch] triton_kernel_perf -> origin/triton_kernel_perf 2025-09-07T07:51:36.7430298Z 
* [new branch] tt_pkg_1908 -> origin/tt_pkg_1908 2025-09-07T07:51:36.7432150Z * [new branch] tweak-transformer-dependabot -> origin/tweak-transformer-dependabot 2025-09-07T07:51:36.7433762Z * [new branch] type_dec -> origin/type_dec 2025-09-07T07:51:36.7435932Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-09-07T07:51:36.7438421Z * [new branch] update-audio-commit-hash/16818882925-1712-1 -> origin/update-audio-commit-hash/16818882925-1712-1 2025-09-07T07:51:36.7439863Z * [new branch] update-audio-commit-hash/16895560422-1720-1 -> origin/update-audio-commit-hash/16895560422-1720-1 2025-09-07T07:51:36.7441345Z * [new branch] update-audio-commit-hash/16924174496-1738-1 -> origin/update-audio-commit-hash/16924174496-1738-1 2025-09-07T07:51:36.7442910Z * [new branch] update-audio-commit-hash/17002010821-1749-1 -> origin/update-audio-commit-hash/17002010821-1749-1 2025-09-07T07:51:36.7444394Z * [new branch] update-audio-commit-hash/17056004427-1766-1 -> origin/update-audio-commit-hash/17056004427-1766-1 2025-09-07T07:51:36.7446118Z * [new branch] update-audio-commit-hash/17085054029-1767-1 -> origin/update-audio-commit-hash/17085054029-1767-1 2025-09-07T07:51:36.7447635Z * [new branch] update-audio-commit-hash/17142507405-1771-1 -> origin/update-audio-commit-hash/17142507405-1771-1 2025-09-07T07:51:36.7449209Z * [new branch] update-audio-commit-hash/17168762740-1773-1 -> origin/update-audio-commit-hash/17168762740-1773-1 2025-09-07T07:51:36.7450621Z * [new branch] update-audio-commit-hash/17311174639-1780-1 -> origin/update-audio-commit-hash/17311174639-1780-1 2025-09-07T07:51:36.7452169Z * [new branch] update-audio-commit-hash/17336898740-1781-1 -> origin/update-audio-commit-hash/17336898740-1781-1 2025-09-07T07:51:36.7453692Z * [new branch] update-audio-commit-hash/17389727684-1786-1 -> origin/update-audio-commit-hash/17389727684-1786-1 2025-09-07T07:51:36.7455498Z * [new branch] update-audio-commit-hash/17449538142-1790-1 -> 
origin/update-audio-commit-hash/17449538142-1790-1 2025-09-07T07:51:36.7457077Z * [new branch] update-audio-commit-hash/17507351808-1794-1 -> origin/update-audio-commit-hash/17507351808-1794-1 2025-09-07T07:51:36.7458739Z * [new branch] update-dynamic-shapes-doc -> origin/update-dynamic-shapes-doc 2025-09-07T07:51:36.7461194Z * [new branch] update-executorch-commit-hash/15694981040-1626-1 -> origin/update-executorch-commit-hash/15694981040-1626-1 2025-09-07T07:51:36.7463609Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-09-07T07:51:36.7466323Z * [new branch] update-vision-commit-hash/15336342773-1607-1 -> origin/update-vision-commit-hash/15336342773-1607-1 2025-09-07T07:51:36.7468635Z * [new branch] update-vllm-commit-hash/16737365217-1704-1 -> origin/update-vllm-commit-hash/16737365217-1704-1 2025-09-07T07:51:36.7470577Z * [new branch] update-vllm-commit-hash/16843157111-1713-1 -> origin/update-vllm-commit-hash/16843157111-1713-1 2025-09-07T07:51:36.7472078Z * [new branch] update-vllm-commit-hash/16855312394-1714-1 -> origin/update-vllm-commit-hash/16855312394-1714-1 2025-09-07T07:51:36.7473553Z * [new branch] update-vllm-commit-hash/16924174496-1738-1 -> origin/update-vllm-commit-hash/16924174496-1738-1 2025-09-07T07:51:36.7475238Z * [new branch] update-vllm-commit-hash/16952608705-1745-1 -> origin/update-vllm-commit-hash/16952608705-1745-1 2025-09-07T07:51:36.7476811Z * [new branch] update-vllm-commit-hash/16979836546-1748-1 -> origin/update-vllm-commit-hash/16979836546-1748-1 2025-09-07T07:51:36.7478242Z * [new branch] update-vllm-commit-hash/17014576881-1756-1 -> origin/update-vllm-commit-hash/17014576881-1756-1 2025-09-07T07:51:36.7479779Z * [new branch] update-vllm-commit-hash/17027830869-1761-1 -> origin/update-vllm-commit-hash/17027830869-1761-1 2025-09-07T07:51:36.7481258Z * [new branch] update-vllm-commit-hash/17056004427-1766-1 -> origin/update-vllm-commit-hash/17056004427-1766-1 
2025-09-07T07:51:36.7482808Z * [new branch] update-vllm-commit-hash/17085054029-1767-1 -> origin/update-vllm-commit-hash/17085054029-1767-1 2025-09-07T07:51:36.7484325Z * [new branch] update-vllm-commit-hash/17113610216-1768-1 -> origin/update-vllm-commit-hash/17113610216-1768-1 2025-09-07T07:51:36.7486161Z * [new branch] update-vllm-commit-hash/17142507405-1771-1 -> origin/update-vllm-commit-hash/17142507405-1771-1 2025-09-07T07:51:36.7487742Z * [new branch] update-vllm-commit-hash/17181878974-1774-1 -> origin/update-vllm-commit-hash/17181878974-1774-1 2025-09-07T07:51:36.7489480Z * [new branch] update-vllm-commit-hash/17311174639-1780-1 -> origin/update-vllm-commit-hash/17311174639-1780-1 2025-09-07T07:51:36.7490966Z * [new branch] update-vllm-commit-hash/17336898740-1781-1 -> origin/update-vllm-commit-hash/17336898740-1781-1 2025-09-07T07:51:36.7492535Z * [new branch] update-vllm-commit-hash/17364352302-1785-1 -> origin/update-vllm-commit-hash/17364352302-1785-1 2025-09-07T07:51:36.7494064Z * [new branch] update-vllm-commit-hash/17389727684-1786-1 -> origin/update-vllm-commit-hash/17389727684-1786-1 2025-09-07T07:51:36.7495893Z * [new branch] update-vllm-commit-hash/17449538142-1790-1 -> origin/update-vllm-commit-hash/17449538142-1790-1 2025-09-07T07:51:36.7497412Z * [new branch] update-vllm-commit-hash/17480069797-1791-1 -> origin/update-vllm-commit-hash/17480069797-1791-1 2025-09-07T07:51:36.7499065Z * [new branch] update-vllm-commit-hash/17507351808-1794-1 -> origin/update-vllm-commit-hash/17507351808-1794-1 2025-09-07T07:51:36.7501367Z * [new branch] update-xla-commit-hash/16873912760-198-1 -> origin/update-xla-commit-hash/16873912760-198-1 2025-09-07T07:51:36.7503226Z * [new branch] update-xla-commit-hash/17034266655-199-1 -> origin/update-xla-commit-hash/17034266655-199-1 2025-09-07T07:51:36.7504602Z * [new branch] update-xla-commit-hash/17202464405-200-1 -> origin/update-xla-commit-hash/17202464405-200-1 2025-09-07T07:51:36.7506834Z * [new branch] 
update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-09-07T07:51:36.7508561Z * [new branch] update_executorch_pin -> origin/update_executorch_pin 2025-09-07T07:51:36.7510560Z * [new branch] update_slow_tests_1722488736 -> origin/update_slow_tests_1722488736 2025-09-07T07:51:36.7512306Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-09-07T07:51:36.7514139Z * [new branch] update_slow_tests_1752478971 -> origin/update_slow_tests_1752478971 2025-09-07T07:51:36.7516227Z * [new branch] update_slow_tests_1755502951 -> origin/update_slow_tests_1755502951 2025-09-07T07:51:36.7517971Z * [new branch] update_slow_tests_1756107664 -> origin/update_slow_tests_1756107664 2025-09-07T07:51:36.7519753Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-09-07T07:51:36.7521534Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-09-07T07:51:36.7523341Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-09-07T07:51:36.7525335Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-09-07T07:51:36.7527390Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-09-07T07:51:36.7529426Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-09-07T07:51:36.7531269Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-09-07T07:51:36.7533164Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-09-07T07:51:36.7535086Z * [new branch] v1.3.0 -> origin/v1.3.0 2025-09-07T07:51:36.7537120Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-09-07T07:51:36.7539131Z * [new branch] validate_fn -> origin/validate_fn 2025-09-07T07:51:36.7541181Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-09-07T07:51:36.7543109Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-09-07T07:51:36.7545799Z * [new branch] viable/strict -> origin/viable/strict 2025-09-07T07:51:36.7547614Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-09-07T07:51:36.7549474Z * [new branch] 
vllmpin -> origin/vllmpin 2025-09-07T07:51:36.7551901Z * [new branch] wdvr/conda_devcontainer -> origin/wdvr/conda_devcontainer 2025-09-07T07:51:36.7553371Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-09-07T07:51:36.7555309Z * [new branch] weight_sharing_cpp -> origin/weight_sharing_cpp 2025-09-07T07:51:36.7557995Z * [new branch] whc/flight4 -> origin/whc/flight4 2025-09-07T07:51:36.7559725Z * [new branch] whc/flight51 -> origin/whc/flight51 2025-09-07T07:51:36.7561321Z * [new branch] whc/flight53 -> origin/whc/flight53 2025-09-07T07:51:36.7562957Z * [new branch] whc/stage2 -> origin/whc/stage2 2025-09-07T07:51:36.7564431Z * [new branch] whc/uneven -> origin/whc/uneven 2025-09-07T07:51:36.7566665Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-09-07T07:51:36.7568416Z * [new branch] win_warnings -> origin/win_warnings 2025-09-07T07:51:36.7570393Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-09-07T07:51:36.7571880Z * [new branch] workonoldcommit -> origin/workonoldcommit 2025-09-07T07:51:36.7573901Z * [new branch] wychi-autotune-prune-configs-by-shared-mem -> origin/wychi-autotune-prune-configs-by-shared-mem 2025-09-07T07:51:36.7576306Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-09-07T07:51:36.7577918Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-09-07T07:51:36.7579634Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-09-07T07:51:36.7581060Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-09-07T07:51:36.7582722Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-09-07T07:51:36.7584435Z * [new branch] xmfan/ca_api -> origin/xmfan/ca_api 2025-09-07T07:51:36.7586244Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-09-07T07:51:36.7587873Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-09-07T07:51:36.7589395Z * [new branch] 
xmfan/ca_cudagraphs -> origin/xmfan/ca_cudagraphs 2025-09-07T07:51:36.7590954Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-09-07T07:51:36.7592613Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-09-07T07:51:36.7594193Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-09-07T07:51:36.7595913Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 2025-09-07T07:51:36.7597330Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-09-07T07:51:36.7599161Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-09-07T07:51:36.7600731Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-09-07T07:51:36.7602231Z * [new branch] xmfan/ca_mem_base -> origin/xmfan/ca_mem_base 2025-09-07T07:51:36.7603754Z * [new branch] xmfan/ca_mem_fix -> origin/xmfan/ca_mem_fix 2025-09-07T07:51:36.7605603Z * [new branch] xmfan/ca_memory_fix -> origin/xmfan/ca_memory_fix 2025-09-07T07:51:36.7607174Z * [new branch] xmfan/ca_memory_fix_rebased -> origin/xmfan/ca_memory_fix_rebased 2025-09-07T07:51:36.7608730Z * [new branch] xmfan/ca_memory_fix_rebased2 -> origin/xmfan/ca_memory_fix_rebased2 2025-09-07T07:51:36.7610227Z * [new branch] xmfan/ca_move_to_cuda -> origin/xmfan/ca_move_to_cuda 2025-09-07T07:51:36.7611754Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-09-07T07:51:36.7613327Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-09-07T07:51:36.7614924Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 2025-09-07T07:51:36.7616712Z * [new branch] xmfan/ca_scalar -> origin/xmfan/ca_scalar 2025-09-07T07:51:36.7618294Z * [new branch] xmfan/ca_subclass_mem_fix -> origin/xmfan/ca_subclass_mem_fix 2025-09-07T07:51:36.7619828Z * [new branch] xmfan/ca_warm_mem -> origin/xmfan/ca_warm_mem 2025-09-07T07:51:36.7621390Z * [new branch] xmfan/ca_warm_mem_base -> origin/xmfan/ca_warm_mem_base 2025-09-07T07:51:36.7623127Z * [new branch] xmfan/cacu_jun18 -> 
origin/xmfan/cacu_jun18 2025-09-07T07:51:36.7624686Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-09-07T07:51:36.7626536Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-09-07T07:51:36.7628602Z * [new branch] xmfan/cacu_may27 -> origin/xmfan/cacu_may27 2025-09-07T07:51:36.7630303Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-09-07T07:51:36.7631900Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-09-07T07:51:36.7633435Z * [new branch] xmfan/issue_123374 -> origin/xmfan/issue_123374 2025-09-07T07:51:36.7635411Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-09-07T07:51:36.7637059Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-09-07T07:51:36.7638637Z * [new branch] xmfan/segfault_test -> origin/xmfan/segfault_test 2025-09-07T07:51:36.7640366Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-09-07T07:51:36.7641971Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-09-07T07:51:36.7643662Z * [new branch] xmfan/test -> origin/xmfan/test 2025-09-07T07:51:36.7646409Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-09-07T07:51:36.7647946Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-09-07T07:51:36.7649496Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-09-07T07:51:36.7651205Z * [new branch] yihan_quantization -> origin/yihan_quantization 2025-09-07T07:51:36.7653649Z * [new branch] yiming/add_jit_trace_benchmark -> origin/yiming/add_jit_trace_benchmark 2025-09-07T07:51:36.7655267Z * [new branch] yiming/add_nativert_benchmark -> origin/yiming/add_nativert_benchmark 2025-09-07T07:51:36.7656851Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 
2025-09-07T07:51:36.7659319Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-09-07T07:51:36.7661028Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-09-07T07:51:36.7662679Z * [new branch] zainr/git-push-v2 -> origin/zainr/git-push-v2 2025-09-07T07:51:36.7664215Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-09-07T07:51:36.7666094Z * [new branch] zainr/test -> origin/zainr/test 2025-09-07T07:51:36.7667557Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-09-07T07:51:36.7669100Z * [new branch] zainr/unstable -> origin/zainr/unstable 2025-09-07T07:51:36.7670730Z * [new branch] zainr/unstable-xla -> origin/zainr/unstable-xla 2025-09-07T07:51:36.7672645Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-09-07T07:51:36.7674297Z * [new branch] zb2p -> origin/zb2p 2025-09-07T07:51:36.7676509Z * [new branch] zero_grad_optimization -> origin/zero_grad_optimization 2025-09-07T07:51:36.7678477Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-09-07T07:51:36.7681255Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-09-07T07:51:36.7683627Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-09-07T07:51:36.7686234Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-09-07T07:51:36.7687676Z * [new tag] bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug -> bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug 2025-09-07T07:51:36.7689265Z * [new tag] ci/binaries/77164 -> ci/binaries/77164 2025-09-07T07:51:36.7690827Z * [new tag] ciflow/binaries/156049 -> ciflow/binaries/156049 2025-09-07T07:51:36.7691529Z * [new tag] ciflow/binaries/156712 -> ciflow/binaries/156712 2025-09-07T07:51:36.7692435Z * [new tag] ciflow/binaries/157432 -> ciflow/binaries/157432 2025-09-07T07:51:36.7693328Z * [new tag] ciflow/binaries/157685 -> ciflow/binaries/157685 2025-09-07T07:51:36.7694181Z * [new tag] ciflow/binaries/157689 -> 
ciflow/binaries/157689 2025-09-07T07:51:36.7695134Z * [new tag] ciflow/binaries/158104 -> ciflow/binaries/158104 2025-09-07T07:51:36.7696301Z * [new tag] ciflow/binaries/160229 -> ciflow/binaries/160229 2025-09-07T07:51:36.7697238Z * [new tag] ciflow/binaries/160720 -> ciflow/binaries/160720 2025-09-07T07:51:36.7698052Z * [new tag] ciflow/binaries/162080 -> ciflow/binaries/162080 2025-09-07T07:51:36.7698952Z * [new tag] ciflow/binaries/162329 -> ciflow/binaries/162329 2025-09-07T07:51:36.7700082Z * [new tag] ciflow/binaries_libtorch/156049 -> ciflow/binaries_libtorch/156049 2025-09-07T07:51:36.7700950Z * [new tag] ciflow/binaries_libtorch/156711 -> ciflow/binaries_libtorch/156711 2025-09-07T07:51:36.7701891Z * [new tag] ciflow/binaries_libtorch/157432 -> ciflow/binaries_libtorch/157432 2025-09-07T07:51:36.7703188Z * [new tag] ciflow/binaries_wheel/156049 -> ciflow/binaries_wheel/156049 2025-09-07T07:51:36.7703846Z * [new tag] ciflow/binaries_wheel/156711 -> ciflow/binaries_wheel/156711 2025-09-07T07:51:36.7704757Z * [new tag] ciflow/binaries_wheel/157432 -> ciflow/binaries_wheel/157432 2025-09-07T07:51:36.7705809Z * [new tag] ciflow/binaries_wheel/162136 -> ciflow/binaries_wheel/162136 2025-09-07T07:51:36.7706808Z * [new tag] ciflow/binaries_wheel/162252 -> ciflow/binaries_wheel/162252 2025-09-07T07:51:36.7707625Z * [new tag] ciflow/binaries_wheel/162325 -> ciflow/binaries_wheel/162325 2025-09-07T07:51:36.7708994Z * [new tag] ciflow/h100-distributed/156703 -> ciflow/h100-distributed/156703 2025-09-07T07:51:36.7710457Z * [new tag] ciflow/h100-symm-mem/157635 -> ciflow/h100-symm-mem/157635 2025-09-07T07:51:36.7711349Z * [new tag] ciflow/h100-symm-mem/161984 -> ciflow/h100-symm-mem/161984 2025-09-07T07:51:36.7712211Z * [new tag] ciflow/h100-symm-mem/162003 -> ciflow/h100-symm-mem/162003 2025-09-07T07:51:36.7713115Z * [new tag] ciflow/h100-symm-mem/162011 -> ciflow/h100-symm-mem/162011 2025-09-07T07:51:36.7713950Z * [new tag] ciflow/h100-symm-mem/162026 -> 
ciflow/h100-symm-mem/162026 2025-09-07T07:51:36.7714855Z * [new tag] ciflow/h100-symm-mem/162033 -> ciflow/h100-symm-mem/162033 2025-09-07T07:51:36.7715969Z * [new tag] ciflow/h100-symm-mem/162040 -> ciflow/h100-symm-mem/162040 2025-09-07T07:51:36.7716879Z * [new tag] ciflow/h100-symm-mem/162041 -> ciflow/h100-symm-mem/162041 2025-09-07T07:51:36.7717734Z * [new tag] ciflow/h100-symm-mem/162142 -> ciflow/h100-symm-mem/162142 2025-09-07T07:51:36.7718831Z * [new tag] ciflow/h100-symm-mem/162150 -> ciflow/h100-symm-mem/162150 2025-09-07T07:51:36.7719646Z * [new tag] ciflow/h100-symm-mem/162243 -> ciflow/h100-symm-mem/162243 2025-09-07T07:51:36.7720506Z * [new tag] ciflow/h100-symm-mem/162320 -> ciflow/h100-symm-mem/162320 2025-09-07T07:51:36.7721667Z * [new tag] ciflow/h100/159158 -> ciflow/h100/159158 2025-09-07T07:51:36.7722960Z * [new tag] ciflow/h100/160480 -> ciflow/h100/160480 2025-09-07T07:51:36.7723902Z * [new tag] ciflow/h100/161749 -> ciflow/h100/161749 2025-09-07T07:51:36.7725202Z * [new tag] ciflow/h100/162022 -> ciflow/h100/162022 2025-09-07T07:51:36.7726057Z * [new tag] ciflow/h100/162278 -> ciflow/h100/162278 2025-09-07T07:51:36.7727470Z * [new tag] ciflow/inductor-perf-test-nightly-rocm/156592 -> ciflow/inductor-perf-test-nightly-rocm/156592 2025-09-07T07:51:36.7728742Z * [new tag] ciflow/inductor-perf-test-nightly/156592 -> ciflow/inductor-perf-test-nightly/156592 2025-09-07T07:51:36.7730037Z * [new tag] ciflow/inductor-periodic/162063 -> ciflow/inductor-periodic/162063 2025-09-07T07:51:36.7730937Z * [new tag] ciflow/inductor-periodic/162227 -> ciflow/inductor-periodic/162227 2025-09-07T07:51:36.7731890Z * [new tag] ciflow/inductor-periodic/162323 -> ciflow/inductor-periodic/162323 2025-09-07T07:51:36.7733205Z * [new tag] ciflow/inductor-rocm/154170 -> ciflow/inductor-rocm/154170 2025-09-07T07:51:36.7734347Z * [new tag] ciflow/inductor-rocm/159146 -> ciflow/inductor-rocm/159146 2025-09-07T07:51:36.7735324Z * [new tag] ciflow/inductor-rocm/159158 -> 
ciflow/inductor-rocm/159158 2025-09-07T07:51:36.7736559Z * [new tag] ciflow/inductor-rocm/161715 -> ciflow/inductor-rocm/161715 2025-09-07T07:51:36.7737604Z * [new tag] ciflow/inductor-rocm/162053 -> ciflow/inductor-rocm/162053 2025-09-07T07:51:36.7738856Z * [new tag] ciflow/inductor-rocm/162056 -> ciflow/inductor-rocm/162056 2025-09-07T07:51:36.7740032Z * [new tag] ciflow/inductor/137400 -> ciflow/inductor/137400 2025-09-07T07:51:36.7740899Z * [new tag] ciflow/inductor/148180 -> ciflow/inductor/148180 2025-09-07T07:51:36.7741841Z * [new tag] ciflow/inductor/148328 -> ciflow/inductor/148328 2025-09-07T07:51:36.7742744Z * [new tag] ciflow/inductor/148484 -> ciflow/inductor/148484 2025-09-07T07:51:36.7743585Z * [new tag] ciflow/inductor/148492 -> ciflow/inductor/148492 2025-09-07T07:51:36.7744412Z * [new tag] ciflow/inductor/152624 -> ciflow/inductor/152624 2025-09-07T07:51:36.7745484Z * [new tag] ciflow/inductor/154694 -> ciflow/inductor/154694 2025-09-07T07:51:36.7746374Z * [new tag] ciflow/inductor/156049 -> ciflow/inductor/156049 2025-09-07T07:51:36.7747230Z * [new tag] ciflow/inductor/156592 -> ciflow/inductor/156592 2025-09-07T07:51:36.7748088Z * [new tag] ciflow/inductor/157635 -> ciflow/inductor/157635 2025-09-07T07:51:36.7749096Z * [new tag] ciflow/inductor/157685 -> ciflow/inductor/157685 2025-09-07T07:51:36.7750120Z * [new tag] ciflow/inductor/157686 -> ciflow/inductor/157686 2025-09-07T07:51:36.7751030Z * [new tag] ciflow/inductor/157689 -> ciflow/inductor/157689 2025-09-07T07:51:36.7751912Z * [new tag] ciflow/inductor/157699 -> ciflow/inductor/157699 2025-09-07T07:51:36.7752921Z * [new tag] ciflow/inductor/157743 -> ciflow/inductor/157743 2025-09-07T07:51:36.7753844Z * [new tag] ciflow/inductor/157994 -> ciflow/inductor/157994 2025-09-07T07:51:36.7754771Z * [new tag] ciflow/inductor/158091 -> ciflow/inductor/158091 2025-09-07T07:51:36.7756004Z * [new tag] ciflow/inductor/158104 -> ciflow/inductor/158104 2025-09-07T07:51:36.7756867Z * [new tag] 
ciflow/inductor/158404 -> ciflow/inductor/158404 2025-09-07T07:51:36.7757790Z * [new tag] ciflow/inductor/158647 -> ciflow/inductor/158647 2025-09-07T07:51:36.7759211Z * [new tag] ciflow/inductor/158932 -> ciflow/inductor/158932 2025-09-07T07:51:36.7760067Z * [new tag] ciflow/inductor/159146 -> ciflow/inductor/159146 2025-09-07T07:51:36.7761131Z * [new tag] ciflow/inductor/159158 -> ciflow/inductor/159158 2025-09-07T07:51:36.7762035Z * [new tag] ciflow/inductor/159274 -> ciflow/inductor/159274 2025-09-07T07:51:36.7762946Z * [new tag] ciflow/inductor/159664 -> ciflow/inductor/159664 2025-09-07T07:51:36.7763941Z * [new tag] ciflow/inductor/159778 -> ciflow/inductor/159778 2025-09-07T07:51:36.7764853Z * [new tag] ciflow/inductor/159835 -> ciflow/inductor/159835 2025-09-07T07:51:36.7766151Z * [new tag] ciflow/inductor/159944 -> ciflow/inductor/159944 2025-09-07T07:51:36.7767281Z * [new tag] ciflow/inductor/160161 -> ciflow/inductor/160161 2025-09-07T07:51:36.7768164Z * [new tag] ciflow/inductor/160174 -> ciflow/inductor/160174 2025-09-07T07:51:36.7769209Z * [new tag] ciflow/inductor/160323 -> ciflow/inductor/160323 2025-09-07T07:51:36.7770359Z * [new tag] ciflow/inductor/160324 -> ciflow/inductor/160324 2025-09-07T07:51:36.7771543Z * [new tag] ciflow/inductor/160325 -> ciflow/inductor/160325 2025-09-07T07:51:36.7772621Z * [new tag] ciflow/inductor/160326 -> ciflow/inductor/160326 2025-09-07T07:51:36.7773603Z * [new tag] ciflow/inductor/160327 -> ciflow/inductor/160327 2025-09-07T07:51:36.7774593Z * [new tag] ciflow/inductor/160328 -> ciflow/inductor/160328 2025-09-07T07:51:36.7776061Z * [new tag] ciflow/inductor/160329 -> ciflow/inductor/160329 2025-09-07T07:51:36.7776930Z * [new tag] ciflow/inductor/160480 -> ciflow/inductor/160480 2025-09-07T07:51:36.7777991Z * [new tag] ciflow/inductor/160532 -> ciflow/inductor/160532 2025-09-07T07:51:36.7779515Z * [new tag] ciflow/inductor/160539 -> ciflow/inductor/160539 2025-09-07T07:51:36.7780439Z * [new tag] 
ciflow/inductor/160580 -> ciflow/inductor/160580 2025-09-07T07:51:36.7781294Z * [new tag] ciflow/inductor/160685 -> ciflow/inductor/160685 2025-09-07T07:51:36.7782352Z * [new tag] ciflow/inductor/160686 -> ciflow/inductor/160686 2025-09-07T07:51:36.7783238Z * [new tag] ciflow/inductor/160687 -> ciflow/inductor/160687 2025-09-07T07:51:36.7784171Z * [new tag] ciflow/inductor/160688 -> ciflow/inductor/160688 2025-09-07T07:51:36.7785176Z * [new tag] ciflow/inductor/160690 -> ciflow/inductor/160690 2025-09-07T07:51:36.7786217Z * [new tag] ciflow/inductor/160706 -> ciflow/inductor/160706 2025-09-07T07:51:36.7787201Z * [new tag] ciflow/inductor/160729 -> ciflow/inductor/160729 2025-09-07T07:51:36.7788342Z * [new tag] ciflow/inductor/160798 -> ciflow/inductor/160798 2025-09-07T07:51:36.7789561Z * [new tag] ciflow/inductor/160836 -> ciflow/inductor/160836 2025-09-07T07:51:36.7790661Z * [new tag] ciflow/inductor/160843 -> ciflow/inductor/160843 2025-09-07T07:51:36.7791918Z * [new tag] ciflow/inductor/160869 -> ciflow/inductor/160869 2025-09-07T07:51:36.7792870Z * [new tag] ciflow/inductor/160920 -> ciflow/inductor/160920 2025-09-07T07:51:36.7793814Z * [new tag] ciflow/inductor/160928 -> ciflow/inductor/160928 2025-09-07T07:51:36.7794768Z * [new tag] ciflow/inductor/160943 -> ciflow/inductor/160943 2025-09-07T07:51:36.7795901Z * [new tag] ciflow/inductor/161092 -> ciflow/inductor/161092 2025-09-07T07:51:36.7796867Z * [new tag] ciflow/inductor/161093 -> ciflow/inductor/161093 2025-09-07T07:51:36.7797981Z * [new tag] ciflow/inductor/161109 -> ciflow/inductor/161109 2025-09-07T07:51:36.7799113Z * [new tag] ciflow/inductor/161118 -> ciflow/inductor/161118 2025-09-07T07:51:36.7800420Z * [new tag] ciflow/inductor/161178 -> ciflow/inductor/161178 2025-09-07T07:51:36.7801275Z * [new tag] ciflow/inductor/161246 -> ciflow/inductor/161246 2025-09-07T07:51:36.7802245Z * [new tag] ciflow/inductor/161349 -> ciflow/inductor/161349 2025-09-07T07:51:36.7803131Z * [new tag] 
ciflow/inductor/161350 -> ciflow/inductor/161350 2025-09-07T07:51:36.7804098Z * [new tag] ciflow/inductor/161351 -> ciflow/inductor/161351 2025-09-07T07:51:36.7805303Z * [new tag] ciflow/inductor/161397 -> ciflow/inductor/161397 2025-09-07T07:51:36.7806365Z * [new tag] ciflow/inductor/161404 -> ciflow/inductor/161404 2025-09-07T07:51:36.7807444Z * [new tag] ciflow/inductor/161405 -> ciflow/inductor/161405 2025-09-07T07:51:36.7808398Z * [new tag] ciflow/inductor/161406 -> ciflow/inductor/161406 2025-09-07T07:51:36.7809576Z * [new tag] ciflow/inductor/161410 -> ciflow/inductor/161410 2025-09-07T07:51:36.7810525Z * [new tag] ciflow/inductor/161414 -> ciflow/inductor/161414 2025-09-07T07:51:36.7811759Z * [new tag] ciflow/inductor/161442 -> ciflow/inductor/161442 2025-09-07T07:51:36.7812732Z * [new tag] ciflow/inductor/161458 -> ciflow/inductor/161458 2025-09-07T07:51:36.7813730Z * [new tag] ciflow/inductor/161468 -> ciflow/inductor/161468 2025-09-07T07:51:36.7814673Z * [new tag] ciflow/inductor/161469 -> ciflow/inductor/161469 2025-09-07T07:51:36.7816080Z * [new tag] ciflow/inductor/161485 -> ciflow/inductor/161485 2025-09-07T07:51:36.7817044Z * [new tag] ciflow/inductor/161499 -> ciflow/inductor/161499 2025-09-07T07:51:36.7818022Z * [new tag] ciflow/inductor/161534 -> ciflow/inductor/161534 2025-09-07T07:51:36.7818960Z * [new tag] ciflow/inductor/161595 -> ciflow/inductor/161595 2025-09-07T07:51:36.7819892Z * [new tag] ciflow/inductor/161596 -> ciflow/inductor/161596 2025-09-07T07:51:36.7821306Z * [new tag] ciflow/inductor/161630 -> ciflow/inductor/161630 2025-09-07T07:51:36.7822505Z * [new tag] ciflow/inductor/161667 -> ciflow/inductor/161667 2025-09-07T07:51:36.7823464Z * [new tag] ciflow/inductor/161670 -> ciflow/inductor/161670 2025-09-07T07:51:36.7824431Z * [new tag] ciflow/inductor/161673 -> ciflow/inductor/161673 2025-09-07T07:51:36.7825645Z * [new tag] ciflow/inductor/161674 -> ciflow/inductor/161674 2025-09-07T07:51:36.7826783Z * [new tag] 
ciflow/inductor/161675 -> ciflow/inductor/161675 2025-09-07T07:51:36.7827636Z * [new tag] ciflow/inductor/161693 -> ciflow/inductor/161693 2025-09-07T07:51:36.7828675Z * [new tag] ciflow/inductor/161695 -> ciflow/inductor/161695 2025-09-07T07:51:36.7829835Z * [new tag] ciflow/inductor/161715 -> ciflow/inductor/161715 2025-09-07T07:51:36.7830893Z * [new tag] ciflow/inductor/161730 -> ciflow/inductor/161730 2025-09-07T07:51:36.7831905Z * [new tag] ciflow/inductor/161732 -> ciflow/inductor/161732 2025-09-07T07:51:36.7832998Z * [new tag] ciflow/inductor/161744 -> ciflow/inductor/161744 2025-09-07T07:51:36.7834018Z * [new tag] ciflow/inductor/161746 -> ciflow/inductor/161746 2025-09-07T07:51:36.7835060Z * [new tag] ciflow/inductor/161747 -> ciflow/inductor/161747 2025-09-07T07:51:36.7836216Z * [new tag] ciflow/inductor/161819 -> ciflow/inductor/161819 2025-09-07T07:51:36.7837218Z * [new tag] ciflow/inductor/161821 -> ciflow/inductor/161821 2025-09-07T07:51:36.7838362Z * [new tag] ciflow/inductor/161828 -> ciflow/inductor/161828 2025-09-07T07:51:36.7839693Z * [new tag] ciflow/inductor/161879 -> ciflow/inductor/161879 2025-09-07T07:51:36.7840526Z * [new tag] ciflow/inductor/161880 -> ciflow/inductor/161880 2025-09-07T07:51:36.7841506Z * [new tag] ciflow/inductor/161881 -> ciflow/inductor/161881 2025-09-07T07:51:36.7842723Z * [new tag] ciflow/inductor/161907 -> ciflow/inductor/161907 2025-09-07T07:51:36.7843654Z * [new tag] ciflow/inductor/161914 -> ciflow/inductor/161914 2025-09-07T07:51:36.7844833Z * [new tag] ciflow/inductor/161924 -> ciflow/inductor/161924 2025-09-07T07:51:36.7846208Z * [new tag] ciflow/inductor/161936 -> ciflow/inductor/161936 2025-09-07T07:51:36.7847318Z * [new tag] ciflow/inductor/161938 -> ciflow/inductor/161938 2025-09-07T07:51:36.7848289Z * [new tag] ciflow/inductor/161939 -> ciflow/inductor/161939 2025-09-07T07:51:36.7849301Z * [new tag] ciflow/inductor/161940 -> ciflow/inductor/161940 2025-09-07T07:51:36.7850286Z * [new tag] 
ciflow/inductor/161955 -> ciflow/inductor/161955 2025-09-07T07:51:36.7851318Z * [new tag] ciflow/inductor/161957 -> ciflow/inductor/161957 2025-09-07T07:51:36.7852315Z * [new tag] ciflow/inductor/161975 -> ciflow/inductor/161975 2025-09-07T07:51:36.7853371Z * [new tag] ciflow/inductor/161977 -> ciflow/inductor/161977 2025-09-07T07:51:36.7854387Z * [new tag] ciflow/inductor/161978 -> ciflow/inductor/161978 2025-09-07T07:51:36.7855592Z * [new tag] ciflow/inductor/161979 -> ciflow/inductor/161979 2025-09-07T07:51:36.7856741Z * [new tag] ciflow/inductor/161980 -> ciflow/inductor/161980 2025-09-07T07:51:36.7857754Z * [new tag] ciflow/inductor/161988 -> ciflow/inductor/161988 2025-09-07T07:51:36.7858847Z * [new tag] ciflow/inductor/161994 -> ciflow/inductor/161994 2025-09-07T07:51:36.7860012Z * [new tag] ciflow/inductor/162013 -> ciflow/inductor/162013 2025-09-07T07:51:36.7861146Z * [new tag] ciflow/inductor/162014 -> ciflow/inductor/162014 2025-09-07T07:51:36.7862287Z * [new tag] ciflow/inductor/162017 -> ciflow/inductor/162017 2025-09-07T07:51:36.7863291Z * [new tag] ciflow/inductor/162021 -> ciflow/inductor/162021 2025-09-07T07:51:36.7864237Z * [new tag] ciflow/inductor/162023 -> ciflow/inductor/162023 2025-09-07T07:51:36.7865440Z * [new tag] ciflow/inductor/162027 -> ciflow/inductor/162027 2025-09-07T07:51:36.7866559Z * [new tag] ciflow/inductor/162029 -> ciflow/inductor/162029 2025-09-07T07:51:36.7867622Z * [new tag] ciflow/inductor/162030 -> ciflow/inductor/162030 2025-09-07T07:51:36.7868676Z * [new tag] ciflow/inductor/162031 -> ciflow/inductor/162031 2025-09-07T07:51:36.7869679Z * [new tag] ciflow/inductor/162033 -> ciflow/inductor/162033 2025-09-07T07:51:36.7870989Z * [new tag] ciflow/inductor/162052 -> ciflow/inductor/162052 2025-09-07T07:51:36.7872017Z * [new tag] ciflow/inductor/162053 -> ciflow/inductor/162053 2025-09-07T07:51:36.7873056Z * [new tag] ciflow/inductor/162056 -> ciflow/inductor/162056 2025-09-07T07:51:36.7874206Z * [new tag] 
ciflow/inductor/162063 -> ciflow/inductor/162063 2025-09-07T07:51:36.7875419Z * [new tag] ciflow/inductor/162066 -> ciflow/inductor/162066 2025-09-07T07:51:36.7876596Z * [new tag] ciflow/inductor/162068 -> ciflow/inductor/162068 2025-09-07T07:51:36.7877844Z * [new tag] ciflow/inductor/162081 -> ciflow/inductor/162081 2025-09-07T07:51:36.7879300Z * [new tag] ciflow/inductor/162088 -> ciflow/inductor/162088 2025-09-07T07:51:36.7880158Z * [new tag] ciflow/inductor/162089 -> ciflow/inductor/162089 2025-09-07T07:51:36.7881172Z * [new tag] ciflow/inductor/162094 -> ciflow/inductor/162094 2025-09-07T07:51:36.7882227Z * [new tag] ciflow/inductor/162098 -> ciflow/inductor/162098 2025-09-07T07:51:36.7883281Z * [new tag] ciflow/inductor/162101 -> ciflow/inductor/162101 2025-09-07T07:51:36.7884313Z * [new tag] ciflow/inductor/162102 -> ciflow/inductor/162102 2025-09-07T07:51:36.7885534Z * [new tag] ciflow/inductor/162104 -> ciflow/inductor/162104 2025-09-07T07:51:36.7886662Z * [new tag] ciflow/inductor/162106 -> ciflow/inductor/162106 2025-09-07T07:51:36.7887703Z * [new tag] ciflow/inductor/162108 -> ciflow/inductor/162108 2025-09-07T07:51:36.7888849Z * [new tag] ciflow/inductor/162126 -> ciflow/inductor/162126 2025-09-07T07:51:36.7889884Z * [new tag] ciflow/inductor/162149 -> ciflow/inductor/162149 2025-09-07T07:51:36.7890941Z * [new tag] ciflow/inductor/162164 -> ciflow/inductor/162164 2025-09-07T07:51:36.7892022Z * [new tag] ciflow/inductor/162166 -> ciflow/inductor/162166 2025-09-07T07:51:36.7893093Z * [new tag] ciflow/inductor/162169 -> ciflow/inductor/162169 2025-09-07T07:51:36.7894157Z * [new tag] ciflow/inductor/162170 -> ciflow/inductor/162170 2025-09-07T07:51:36.7895282Z * [new tag] ciflow/inductor/162171 -> ciflow/inductor/162171 2025-09-07T07:51:36.7896458Z * [new tag] ciflow/inductor/162183 -> ciflow/inductor/162183 2025-09-07T07:51:36.7897551Z * [new tag] ciflow/inductor/162189 -> ciflow/inductor/162189 2025-09-07T07:51:36.7898640Z * [new tag] 
ciflow/inductor/162190 -> ciflow/inductor/162190 2025-09-07T07:51:36.7899678Z * [new tag] ciflow/inductor/162191 -> ciflow/inductor/162191 2025-09-07T07:51:36.7900723Z * [new tag] ciflow/inductor/162194 -> ciflow/inductor/162194 2025-09-07T07:51:36.7902076Z * [new tag] ciflow/inductor/162200 -> ciflow/inductor/162200 2025-09-07T07:51:36.7903221Z * [new tag] ciflow/inductor/162201 -> ciflow/inductor/162201 2025-09-07T07:51:36.7904351Z * [new tag] ciflow/inductor/162208 -> ciflow/inductor/162208 2025-09-07T07:51:36.7905911Z * [new tag] ciflow/inductor/162211 -> ciflow/inductor/162211 2025-09-07T07:51:36.7907026Z * [new tag] ciflow/inductor/162216 -> ciflow/inductor/162216 2025-09-07T07:51:36.7908149Z * [new tag] ciflow/inductor/162220 -> ciflow/inductor/162220 2025-09-07T07:51:36.7909529Z * [new tag] ciflow/inductor/162222 -> ciflow/inductor/162222 2025-09-07T07:51:36.7910842Z * [new tag] ciflow/inductor/162227 -> ciflow/inductor/162227 2025-09-07T07:51:36.7911871Z * [new tag] ciflow/inductor/162238 -> ciflow/inductor/162238 2025-09-07T07:51:36.7912995Z * [new tag] ciflow/inductor/162239 -> ciflow/inductor/162239 2025-09-07T07:51:36.7914067Z * [new tag] ciflow/inductor/162240 -> ciflow/inductor/162240 2025-09-07T07:51:36.7915263Z * [new tag] ciflow/inductor/162244 -> ciflow/inductor/162244 2025-09-07T07:51:36.7916484Z * [new tag] ciflow/inductor/162245 -> ciflow/inductor/162245 2025-09-07T07:51:36.7917607Z * [new tag] ciflow/inductor/162262 -> ciflow/inductor/162262 2025-09-07T07:51:36.7918937Z * [new tag] ciflow/inductor/162275 -> ciflow/inductor/162275 2025-09-07T07:51:36.7920021Z * [new tag] ciflow/inductor/162278 -> ciflow/inductor/162278 2025-09-07T07:51:36.7921276Z * [new tag] ciflow/inductor/162284 -> ciflow/inductor/162284 2025-09-07T07:51:36.7922202Z * [new tag] ciflow/inductor/162286 -> ciflow/inductor/162286 2025-09-07T07:51:36.7923243Z * [new tag] ciflow/inductor/162288 -> ciflow/inductor/162288 2025-09-07T07:51:36.7924315Z * [new tag] 
ciflow/inductor/162293 -> ciflow/inductor/162293 2025-09-07T07:51:36.7925632Z * [new tag] ciflow/inductor/162294 -> ciflow/inductor/162294 2025-09-07T07:51:36.7926851Z * [new tag] ciflow/inductor/162295 -> ciflow/inductor/162295 2025-09-07T07:51:36.7928012Z * [new tag] ciflow/inductor/162296 -> ciflow/inductor/162296 2025-09-07T07:51:36.7929356Z * [new tag] ciflow/inductor/162298 -> ciflow/inductor/162298 2025-09-07T07:51:36.7930519Z * [new tag] ciflow/inductor/162307 -> ciflow/inductor/162307 2025-09-07T07:51:36.7931668Z * [new tag] ciflow/inductor/162309 -> ciflow/inductor/162309 2025-09-07T07:51:36.7932841Z * [new tag] ciflow/inductor/162311 -> ciflow/inductor/162311 2025-09-07T07:51:36.7933958Z * [new tag] ciflow/inductor/162312 -> ciflow/inductor/162312 2025-09-07T07:51:36.7935165Z * [new tag] ciflow/inductor/162315 -> ciflow/inductor/162315 2025-09-07T07:51:36.7936466Z * [new tag] ciflow/inductor/162316 -> ciflow/inductor/162316 2025-09-07T07:51:36.7937534Z * [new tag] ciflow/inductor/162318 -> ciflow/inductor/162318 2025-09-07T07:51:36.7938652Z * [new tag] ciflow/inductor/162323 -> ciflow/inductor/162323 2025-09-07T07:51:36.7939800Z * [new tag] ciflow/inductor/162341 -> ciflow/inductor/162341 2025-09-07T07:51:36.7940920Z * [new tag] ciflow/inductor/162345 -> ciflow/inductor/162345 2025-09-07T07:51:36.7942368Z * [new tag] ciflow/inductor/3b9a386 -> ciflow/inductor/3b9a386 2025-09-07T07:51:36.7943596Z * [new tag] ciflow/inductor/3d4b92b -> ciflow/inductor/3d4b92b 2025-09-07T07:51:36.7944861Z * [new tag] ciflow/inductor/d224ac7 -> ciflow/inductor/d224ac7 2025-09-07T07:51:36.7946230Z * [new tag] ciflow/linux-aarch64/157994 -> ciflow/linux-aarch64/157994 2025-09-07T07:51:36.7947081Z * [new tag] ciflow/linux-aarch64/159737 -> ciflow/linux-aarch64/159737 2025-09-07T07:51:36.7947956Z * [new tag] ciflow/linux-aarch64/160078 -> ciflow/linux-aarch64/160078 2025-09-07T07:51:36.7949188Z * [new tag] ciflow/mps/157553 -> ciflow/mps/157553 2025-09-07T07:51:36.7950057Z * 
[new tag] ciflow/mps/157635 -> ciflow/mps/157635 2025-09-07T07:51:36.7950893Z * [new tag] ciflow/mps/161988 -> ciflow/mps/161988 2025-09-07T07:51:36.7951773Z * [new tag] ciflow/mps/162108 -> ciflow/mps/162108 2025-09-07T07:51:36.7952645Z * [new tag] ciflow/mps/162153 -> ciflow/mps/162153 2025-09-07T07:51:36.7953488Z * [new tag] ciflow/mps/162281 -> ciflow/mps/162281 2025-09-07T07:51:36.7954647Z * [new tag] ciflow/nightly/156049 -> ciflow/nightly/156049 2025-09-07T07:51:36.7955881Z * [new tag] ciflow/nightly/158104 -> ciflow/nightly/158104 2025-09-07T07:51:36.7956950Z * [new tag] ciflow/op-benchmark/157994 -> ciflow/op-benchmark/157994 2025-09-07T07:51:36.7958402Z * [new tag] ciflow/periodic-rocm-mi300/161529 -> ciflow/periodic-rocm-mi300/161529 2025-09-07T07:51:36.7959478Z * [new tag] ciflow/periodic-rocm-mi300/161715 -> ciflow/periodic-rocm-mi300/161715 2025-09-07T07:51:36.7960871Z * [new tag] ciflow/periodic/054a2fd -> ciflow/periodic/054a2fd 2025-09-07T07:51:36.7961962Z * [new tag] ciflow/periodic/156703 -> ciflow/periodic/156703 2025-09-07T07:51:36.7962605Z * [new tag] ciflow/periodic/161715 -> ciflow/periodic/161715 2025-09-07T07:51:36.7963478Z * [new tag] ciflow/periodic/162021 -> ciflow/periodic/162021 2025-09-07T07:51:36.7964303Z * [new tag] ciflow/periodic/162323 -> ciflow/periodic/162323 2025-09-07T07:51:36.7965613Z * [new tag] ciflow/periodic/2a6d37d -> ciflow/periodic/2a6d37d 2025-09-07T07:51:36.7966648Z * [new tag] ciflow/periodic/317eeb8 -> ciflow/periodic/317eeb8 2025-09-07T07:51:36.7967612Z * [new tag] ciflow/periodic/3c32 -> ciflow/periodic/3c32 2025-09-07T07:51:36.7968690Z * [new tag] ciflow/periodic/3e98831 -> ciflow/periodic/3e98831 2025-09-07T07:51:36.7969853Z * [new tag] ciflow/periodic/94512-point -> ciflow/periodic/94512-point 2025-09-07T07:51:36.7971309Z * [new tag] ciflow/periodic/csl/test87519 -> ciflow/periodic/csl/test87519 2025-09-07T07:51:36.7972335Z * [new tag] ciflow/periodic/csltest88275 -> ciflow/periodic/csltest88275 
2025-09-07T07:51:36.7973422Z * [new tag] ciflow/periodic/csltest88761 -> ciflow/periodic/csltest88761 2025-09-07T07:51:36.7974554Z * [new tag] ciflow/periodic/release_1.12 -> ciflow/periodic/release_1.12 2025-09-07T07:51:36.7976269Z * [new tag] ciflow/periodic/release_1.12.0 -> ciflow/periodic/release_1.12.0 2025-09-07T07:51:36.7977425Z * [new tag] ciflow/periodic/sha-ec5b83 -> ciflow/periodic/sha-ec5b83 2025-09-07T07:51:36.7978573Z * [new tag] ciflow/rocm-mi300/154170 -> ciflow/rocm-mi300/154170 2025-09-07T07:51:36.7979553Z * [new tag] ciflow/rocm-mi300/158747 -> ciflow/rocm-mi300/158747 2025-09-07T07:51:36.7980409Z * [new tag] ciflow/rocm-mi300/159146 -> ciflow/rocm-mi300/159146 2025-09-07T07:51:36.7981244Z * [new tag] ciflow/rocm-mi300/159158 -> ciflow/rocm-mi300/159158 2025-09-07T07:51:36.7982256Z * [new tag] ciflow/rocm-mi300/161715 -> ciflow/rocm-mi300/161715 2025-09-07T07:51:36.7983068Z * [new tag] ciflow/rocm-mi300/161957 -> ciflow/rocm-mi300/161957 2025-09-07T07:51:36.7983886Z * [new tag] ciflow/rocm-mi300/162053 -> ciflow/rocm-mi300/162053 2025-09-07T07:51:36.7984700Z * [new tag] ciflow/rocm-mi300/162056 -> ciflow/rocm-mi300/162056 2025-09-07T07:51:36.7986010Z * [new tag] ciflow/rocm-mi300/162112 -> ciflow/rocm-mi300/162112 2025-09-07T07:51:36.7986823Z * [new tag] ciflow/rocm-mi300/162245 -> ciflow/rocm-mi300/162245 2025-09-07T07:51:36.7987686Z * [new tag] ciflow/rocm-mi300/162278 -> ciflow/rocm-mi300/162278 2025-09-07T07:51:36.7988586Z * [new tag] ciflow/rocm-mi300/162288 -> ciflow/rocm-mi300/162288 2025-09-07T07:51:36.7989919Z * [new tag] ciflow/rocm-mi355/162053 -> ciflow/rocm-mi355/162053 2025-09-07T07:51:36.7990876Z * [new tag] ciflow/rocm-mi355/162056 -> ciflow/rocm-mi355/162056 2025-09-07T07:51:36.7992005Z * [new tag] ciflow/rocm/148492 -> ciflow/rocm/148492 2025-09-07T07:51:36.7992845Z * [new tag] ciflow/rocm/154170 -> ciflow/rocm/154170 2025-09-07T07:51:36.7993918Z * [new tag] ciflow/rocm/156491 -> ciflow/rocm/156491 2025-09-07T07:51:36.7994725Z 
* [new tag] ciflow/rocm/156592 -> ciflow/rocm/156592 2025-09-07T07:51:36.7995755Z * [new tag] ciflow/rocm/158747 -> ciflow/rocm/158747 2025-09-07T07:51:36.7996651Z * [new tag] ciflow/rocm/159146 -> ciflow/rocm/159146 2025-09-07T07:51:36.7997684Z * [new tag] ciflow/rocm/159158 -> ciflow/rocm/159158 2025-09-07T07:51:36.7998688Z * [new tag] ciflow/rocm/161715 -> ciflow/rocm/161715 2025-09-07T07:51:36.7999561Z * [new tag] ciflow/rocm/161972 -> ciflow/rocm/161972 2025-09-07T07:51:36.8000370Z * [new tag] ciflow/rocm/162052 -> ciflow/rocm/162052 2025-09-07T07:51:36.8001232Z * [new tag] ciflow/rocm/162053 -> ciflow/rocm/162053 2025-09-07T07:51:36.8002084Z * [new tag] ciflow/rocm/162056 -> ciflow/rocm/162056 2025-09-07T07:51:36.8002975Z * [new tag] ciflow/rocm/162112 -> ciflow/rocm/162112 2025-09-07T07:51:36.8003873Z * [new tag] ciflow/rocm/162278 -> ciflow/rocm/162278 2025-09-07T07:51:36.8004758Z * [new tag] ciflow/rocm/162288 -> ciflow/rocm/162288 2025-09-07T07:51:36.8005919Z * [new tag] ciflow/rocm/162305 -> ciflow/rocm/162305 2025-09-07T07:51:36.8007207Z * [new tag] ciflow/slow/01c7106 -> ciflow/slow/01c7106 2025-09-07T07:51:36.8008283Z * [new tag] ciflow/slow/0577043 -> ciflow/slow/0577043 2025-09-07T07:51:36.8009896Z * [new tag] ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym -> ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym 2025-09-07T07:51:36.8010751Z * [new tag] ciflow/slow/0e81104 -> ciflow/slow/0e81104 2025-09-07T07:51:36.8011581Z * [new tag] ciflow/slow/161395 -> ciflow/slow/161395 2025-09-07T07:51:36.8012706Z * [new tag] ciflow/slow/1732077 -> ciflow/slow/1732077 2025-09-07T07:51:36.8013772Z * [new tag] ciflow/slow/187eb7c -> ciflow/slow/187eb7c 2025-09-07T07:51:36.8014734Z * [new tag] ciflow/slow/1faef89 -> ciflow/slow/1faef89 2025-09-07T07:51:36.8016004Z * [new tag] ciflow/slow/3920ec1 -> ciflow/slow/3920ec1 2025-09-07T07:51:36.8017022Z * [new tag] ciflow/slow/3b7c6b2 -> ciflow/slow/3b7c6b2 2025-09-07T07:51:36.8018216Z * [new tag] 
ciflow/slow/59a3759 -> ciflow/slow/59a3759 2025-09-07T07:51:36.8019419Z * [new tag] ciflow/slow/70ef0bb -> ciflow/slow/70ef0bb 2025-09-07T07:51:36.8020444Z * [new tag] ciflow/slow/788ff06 -> ciflow/slow/788ff06 2025-09-07T07:51:36.8021893Z * [new tag] ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym -> ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym 2025-09-07T07:51:36.8022791Z * [new tag] ciflow/slow/9d85864 -> ciflow/slow/9d85864 2025-09-07T07:51:36.8023742Z * [new tag] ciflow/slow/9ffad5b -> ciflow/slow/9ffad5b 2025-09-07T07:51:36.8024782Z * [new tag] ciflow/slow/a206e8b -> ciflow/slow/a206e8b 2025-09-07T07:51:36.8026172Z * [new tag] ciflow/slow/a837609 -> ciflow/slow/a837609 2025-09-07T07:51:36.8027151Z * [new tag] ciflow/slow/af841f3 -> ciflow/slow/af841f3 2025-09-07T07:51:36.8028484Z * [new tag] ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym -> ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym 2025-09-07T07:51:36.8029512Z * [new tag] ciflow/triton_binaries/162329 -> ciflow/triton_binaries/162329 2025-09-07T07:51:36.8030645Z * [new tag] ciflow/trunk/113258 -> ciflow/trunk/113258 2025-09-07T07:51:36.8031541Z * [new tag] ciflow/trunk/137400 -> ciflow/trunk/137400 2025-09-07T07:51:36.8032411Z * [new tag] ciflow/trunk/148180 -> ciflow/trunk/148180 2025-09-07T07:51:36.8033285Z * [new tag] ciflow/trunk/148328 -> ciflow/trunk/148328 2025-09-07T07:51:36.8034173Z * [new tag] ciflow/trunk/148492 -> ciflow/trunk/148492 2025-09-07T07:51:36.8035499Z * [new tag] ciflow/trunk/148919 -> ciflow/trunk/148919 2025-09-07T07:51:36.8036663Z * [new tag] ciflow/trunk/152624 -> ciflow/trunk/152624 2025-09-07T07:51:36.8037298Z * [new tag] ciflow/trunk/154170 -> ciflow/trunk/154170 2025-09-07T07:51:36.8038172Z * [new tag] ciflow/trunk/154694 -> ciflow/trunk/154694 2025-09-07T07:51:36.8039072Z * [new tag] ciflow/trunk/156049 -> ciflow/trunk/156049 2025-09-07T07:51:36.8039870Z * [new tag] ciflow/trunk/156703 -> ciflow/trunk/156703 
2025-09-07T07:51:36.8040706Z * [new tag] ciflow/trunk/156711 -> ciflow/trunk/156711 2025-09-07T07:51:36.8041569Z * [new tag] ciflow/trunk/157432 -> ciflow/trunk/157432 2025-09-07T07:51:36.8042421Z * [new tag] ciflow/trunk/157685 -> ciflow/trunk/157685 2025-09-07T07:51:36.8043286Z * [new tag] ciflow/trunk/157689 -> ciflow/trunk/157689 2025-09-07T07:51:36.8044172Z * [new tag] ciflow/trunk/157699 -> ciflow/trunk/157699 2025-09-07T07:51:36.8045205Z * [new tag] ciflow/trunk/157813 -> ciflow/trunk/157813 2025-09-07T07:51:36.8046218Z * [new tag] ciflow/trunk/157994 -> ciflow/trunk/157994 2025-09-07T07:51:36.8047041Z * [new tag] ciflow/trunk/158091 -> ciflow/trunk/158091 2025-09-07T07:51:36.8047887Z * [new tag] ciflow/trunk/158104 -> ciflow/trunk/158104 2025-09-07T07:51:36.8048937Z * [new tag] ciflow/trunk/158404 -> ciflow/trunk/158404 2025-09-07T07:51:36.8049793Z * [new tag] ciflow/trunk/158647 -> ciflow/trunk/158647 2025-09-07T07:51:36.8050922Z * [new tag] ciflow/trunk/158846 -> ciflow/trunk/158846 2025-09-07T07:51:36.8051820Z * [new tag] ciflow/trunk/159158 -> ciflow/trunk/159158 2025-09-07T07:51:36.8052826Z * [new tag] ciflow/trunk/159682 -> ciflow/trunk/159682 2025-09-07T07:51:36.8053734Z * [new tag] ciflow/trunk/159835 -> ciflow/trunk/159835 2025-09-07T07:51:36.8054598Z * [new tag] ciflow/trunk/160161 -> ciflow/trunk/160161 2025-09-07T07:51:36.8055839Z * [new tag] ciflow/trunk/160236 -> ciflow/trunk/160236 2025-09-07T07:51:36.8056737Z * [new tag] ciflow/trunk/160329 -> ciflow/trunk/160329 2025-09-07T07:51:36.8057590Z * [new tag] ciflow/trunk/160480 -> ciflow/trunk/160480 2025-09-07T07:51:36.8058447Z * [new tag] ciflow/trunk/160532 -> ciflow/trunk/160532 2025-09-07T07:51:36.8059328Z * [new tag] ciflow/trunk/160836 -> ciflow/trunk/160836 2025-09-07T07:51:36.8060186Z * [new tag] ciflow/trunk/160843 -> ciflow/trunk/160843 2025-09-07T07:51:36.8061127Z * [new tag] ciflow/trunk/160869 -> ciflow/trunk/160869 2025-09-07T07:51:36.8062222Z * [new tag] ciflow/trunk/160928 -> 
ciflow/trunk/160928 2025-09-07T07:51:36.8063180Z * [new tag] ciflow/trunk/160940 -> ciflow/trunk/160940 2025-09-07T07:51:36.8064118Z * [new tag] ciflow/trunk/160943 -> ciflow/trunk/160943 2025-09-07T07:51:36.8065405Z * [new tag] ciflow/trunk/160953 -> ciflow/trunk/160953 2025-09-07T07:51:36.8066537Z * [new tag] ciflow/trunk/161035 -> ciflow/trunk/161035 2025-09-07T07:51:36.8067454Z * [new tag] ciflow/trunk/161178 -> ciflow/trunk/161178 2025-09-07T07:51:36.8068324Z * [new tag] ciflow/trunk/161349 -> ciflow/trunk/161349 2025-09-07T07:51:36.8069247Z * [new tag] ciflow/trunk/161350 -> ciflow/trunk/161350 2025-09-07T07:51:36.8070173Z * [new tag] ciflow/trunk/161351 -> ciflow/trunk/161351 2025-09-07T07:51:36.8071252Z * [new tag] ciflow/trunk/161395 -> ciflow/trunk/161395 2025-09-07T07:51:36.8072011Z * [new tag] ciflow/trunk/161405 -> ciflow/trunk/161405 2025-09-07T07:51:36.8072953Z * [new tag] ciflow/trunk/161406 -> ciflow/trunk/161406 2025-09-07T07:51:36.8073816Z * [new tag] ciflow/trunk/161410 -> ciflow/trunk/161410 2025-09-07T07:51:36.8074727Z * [new tag] ciflow/trunk/161468 -> ciflow/trunk/161468 2025-09-07T07:51:36.8075897Z * [new tag] ciflow/trunk/161499 -> ciflow/trunk/161499 2025-09-07T07:51:36.8077091Z * [new tag] ciflow/trunk/161527 -> ciflow/trunk/161527 2025-09-07T07:51:36.8078049Z * [new tag] ciflow/trunk/161534 -> ciflow/trunk/161534 2025-09-07T07:51:36.8078933Z * [new tag] ciflow/trunk/161591 -> ciflow/trunk/161591 2025-09-07T07:51:36.8079902Z * [new tag] ciflow/trunk/161595 -> ciflow/trunk/161595 2025-09-07T07:51:36.8080804Z * [new tag] ciflow/trunk/161596 -> ciflow/trunk/161596 2025-09-07T07:51:36.8081704Z * [new tag] ciflow/trunk/161633 -> ciflow/trunk/161633 2025-09-07T07:51:36.8082592Z * [new tag] ciflow/trunk/161634 -> ciflow/trunk/161634 2025-09-07T07:51:36.8083485Z * [new tag] ciflow/trunk/161635 -> ciflow/trunk/161635 2025-09-07T07:51:36.8084362Z * [new tag] ciflow/trunk/161667 -> ciflow/trunk/161667 2025-09-07T07:51:36.8085600Z * [new tag] 
ciflow/trunk/161670 -> ciflow/trunk/161670 2025-09-07T07:51:36.8086612Z * [new tag] ciflow/trunk/161692 -> ciflow/trunk/161692 2025-09-07T07:51:36.8087601Z * [new tag] ciflow/trunk/161693 -> ciflow/trunk/161693 2025-09-07T07:51:36.8088561Z * [new tag] ciflow/trunk/161695 -> ciflow/trunk/161695 2025-09-07T07:51:36.8089484Z * [new tag] ciflow/trunk/161730 -> ciflow/trunk/161730 2025-09-07T07:51:36.8090406Z * [new tag] ciflow/trunk/161744 -> ciflow/trunk/161744 2025-09-07T07:51:36.8091342Z * [new tag] ciflow/trunk/161749 -> ciflow/trunk/161749 2025-09-07T07:51:36.8092322Z * [new tag] ciflow/trunk/161881 -> ciflow/trunk/161881 2025-09-07T07:51:36.8093254Z * [new tag] ciflow/trunk/161924 -> ciflow/trunk/161924 2025-09-07T07:51:36.8094308Z * [new tag] ciflow/trunk/161926 -> ciflow/trunk/161926 2025-09-07T07:51:36.8095640Z * [new tag] ciflow/trunk/161936 -> ciflow/trunk/161936 2025-09-07T07:51:36.8096613Z * [new tag] ciflow/trunk/161952 -> ciflow/trunk/161952 2025-09-07T07:51:36.8097568Z * [new tag] ciflow/trunk/161955 -> ciflow/trunk/161955 2025-09-07T07:51:36.8098522Z * [new tag] ciflow/trunk/161957 -> ciflow/trunk/161957 2025-09-07T07:51:36.8099500Z * [new tag] ciflow/trunk/161959 -> ciflow/trunk/161959 2025-09-07T07:51:36.8100476Z * [new tag] ciflow/trunk/161977 -> ciflow/trunk/161977 2025-09-07T07:51:36.8101389Z * [new tag] ciflow/trunk/161988 -> ciflow/trunk/161988 2025-09-07T07:51:36.8102478Z * [new tag] ciflow/trunk/161994 -> ciflow/trunk/161994 2025-09-07T07:51:36.8103533Z * [new tag] ciflow/trunk/162007 -> ciflow/trunk/162007 2025-09-07T07:51:36.8104475Z * [new tag] ciflow/trunk/162013 -> ciflow/trunk/162013 2025-09-07T07:51:36.8105660Z * [new tag] ciflow/trunk/162017 -> ciflow/trunk/162017 2025-09-07T07:51:36.8106677Z * [new tag] ciflow/trunk/162021 -> ciflow/trunk/162021 2025-09-07T07:51:36.8107614Z * [new tag] ciflow/trunk/162022 -> ciflow/trunk/162022 2025-09-07T07:51:36.8108757Z * [new tag] ciflow/trunk/162040 -> ciflow/trunk/162040 
2025-09-07T07:51:36.8109520Z * [new tag] ciflow/trunk/162041 -> ciflow/trunk/162041 2025-09-07T07:51:36.8127794Z * [new tag] ciflow/trunk/162062 -> ciflow/trunk/162062 2025-09-07T07:51:36.8128067Z * [new tag] ciflow/trunk/162066 -> ciflow/trunk/162066 2025-09-07T07:51:36.8128227Z * [new tag] ciflow/trunk/162089 -> ciflow/trunk/162089 2025-09-07T07:51:36.8128371Z * [new tag] ciflow/trunk/162099 -> ciflow/trunk/162099 2025-09-07T07:51:36.8128504Z * [new tag] ciflow/trunk/162104 -> ciflow/trunk/162104 2025-09-07T07:51:36.8128637Z * [new tag] ciflow/trunk/162106 -> ciflow/trunk/162106 2025-09-07T07:51:36.8128759Z * [new tag] ciflow/trunk/162112 -> ciflow/trunk/162112 2025-09-07T07:51:36.8128889Z * [new tag] ciflow/trunk/162119 -> ciflow/trunk/162119 2025-09-07T07:51:36.8129019Z * [new tag] ciflow/trunk/162142 -> ciflow/trunk/162142 2025-09-07T07:51:36.8129140Z * [new tag] ciflow/trunk/162169 -> ciflow/trunk/162169 2025-09-07T07:51:36.8129266Z * [new tag] ciflow/trunk/162183 -> ciflow/trunk/162183 2025-09-07T07:51:36.8129611Z * [new tag] ciflow/trunk/162190 -> ciflow/trunk/162190 2025-09-07T07:51:36.8129745Z * [new tag] ciflow/trunk/162194 -> ciflow/trunk/162194 2025-09-07T07:51:36.8129869Z * [new tag] ciflow/trunk/162200 -> ciflow/trunk/162200 2025-09-07T07:51:36.8129990Z * [new tag] ciflow/trunk/162206 -> ciflow/trunk/162206 2025-09-07T07:51:36.8130116Z * [new tag] ciflow/trunk/162208 -> ciflow/trunk/162208 2025-09-07T07:51:36.8130237Z * [new tag] ciflow/trunk/162222 -> ciflow/trunk/162222 2025-09-07T07:51:36.8130369Z * [new tag] ciflow/trunk/162238 -> ciflow/trunk/162238 2025-09-07T07:51:36.8130500Z * [new tag] ciflow/trunk/162244 -> ciflow/trunk/162244 2025-09-07T07:51:36.8130622Z * [new tag] ciflow/trunk/162267 -> ciflow/trunk/162267 2025-09-07T07:51:36.8130883Z * [new tag] ciflow/trunk/162269 -> ciflow/trunk/162269 2025-09-07T07:51:36.8132008Z * [new tag] ciflow/trunk/162278 -> ciflow/trunk/162278 2025-09-07T07:51:36.8132974Z * [new tag] ciflow/trunk/162286 -> 
ciflow/trunk/162286 2025-09-07T07:51:36.8133919Z * [new tag] ciflow/trunk/162288 -> ciflow/trunk/162288 2025-09-07T07:51:36.8134827Z * [new tag] ciflow/trunk/162293 -> ciflow/trunk/162293 2025-09-07T07:51:36.8136094Z * [new tag] ciflow/trunk/162310 -> ciflow/trunk/162310 2025-09-07T07:51:36.8137095Z * [new tag] ciflow/trunk/162311 -> ciflow/trunk/162311 2025-09-07T07:51:36.8138052Z * [new tag] ciflow/trunk/162315 -> ciflow/trunk/162315 2025-09-07T07:51:36.8139028Z * [new tag] ciflow/trunk/162325 -> ciflow/trunk/162325 2025-09-07T07:51:36.8140223Z * [new tag] ciflow/trunk/162328 -> ciflow/trunk/162328 2025-09-07T07:51:36.8141183Z * [new tag] ciflow/trunk/162329 -> ciflow/trunk/162329 2025-09-07T07:51:36.8142765Z * [new tag] ciflow/unstable/123 -> ciflow/unstable/123 2025-09-07T07:51:36.8143901Z * [new tag] ciflow/vllm/162292 -> ciflow/vllm/162292 2025-09-07T07:51:36.8145251Z * [new tag] ciflow/win-arm64/156049 -> ciflow/win-arm64/156049 2025-09-07T07:51:36.8146213Z * [new tag] ciflow/win-arm64/158104 -> ciflow/win-arm64/158104 2025-09-07T07:51:36.8147287Z * [new tag] ciflow/xpu/157699 -> ciflow/xpu/157699 2025-09-07T07:51:36.8148316Z * [new tag] ciflow/xpu/157994 -> ciflow/xpu/157994 2025-09-07T07:51:36.8149155Z * [new tag] ciflow/xpu/159459 -> ciflow/xpu/159459 2025-09-07T07:51:36.8150018Z * [new tag] ciflow/xpu/159718 -> ciflow/xpu/159718 2025-09-07T07:51:36.8150866Z * [new tag] ciflow/xpu/159944 -> ciflow/xpu/159944 2025-09-07T07:51:36.8151761Z * [new tag] ciflow/xpu/160867 -> ciflow/xpu/160867 2025-09-07T07:51:36.8152758Z * [new tag] ciflow/xpu/160938 -> ciflow/xpu/160938 2025-09-07T07:51:36.8153621Z * [new tag] ciflow/xpu/160940 -> ciflow/xpu/160940 2025-09-07T07:51:36.8154472Z * [new tag] ciflow/xpu/160953 -> ciflow/xpu/160953 2025-09-07T07:51:36.8155677Z * [new tag] ciflow/xpu/161045 -> ciflow/xpu/161045 2025-09-07T07:51:36.8156877Z * [new tag] ciflow/xpu/161058 -> ciflow/xpu/161058 2025-09-07T07:51:36.8157773Z * [new tag] ciflow/xpu/161246 -> 
ciflow/xpu/161246 2025-09-07T07:51:36.8158616Z * [new tag] ciflow/xpu/161397 -> ciflow/xpu/161397 2025-09-07T07:51:36.8159426Z * [new tag] ciflow/xpu/161485 -> ciflow/xpu/161485 2025-09-07T07:51:36.8160284Z * [new tag] ciflow/xpu/161988 -> ciflow/xpu/161988 2025-09-07T07:51:36.8161162Z * [new tag] ciflow/xpu/162062 -> ciflow/xpu/162062 2025-09-07T07:51:36.8162112Z * [new tag] cslpull75 -> cslpull75 2025-09-07T07:51:36.8163019Z * [new tag] cslpull76 -> cslpull76 2025-09-07T07:51:36.8164058Z * [new tag] cslpull77 -> cslpull77 2025-09-07T07:51:36.8165195Z * [new tag] cslpull78 -> cslpull78 2025-09-07T07:51:36.8166223Z * [new tag] cslpull79 -> cslpull79 2025-09-07T07:51:36.8167187Z * [new tag] cslpull80 -> cslpull80 2025-09-07T07:51:36.8168195Z * [new tag] cslpull81 -> cslpull81 2025-09-07T07:51:36.8169087Z * [new tag] cslpull82 -> cslpull82 2025-09-07T07:51:36.8170049Z * [new tag] cslpull83 -> cslpull83 2025-09-07T07:51:36.8171022Z * [new tag] cslpull84 -> cslpull84 2025-09-07T07:51:36.8172021Z * [new tag] cslpull85 -> cslpull85 2025-09-07T07:51:36.8172967Z * [new tag] cslpull86 -> cslpull86 2025-09-07T07:51:36.8173885Z * [new tag] cslpull87 -> cslpull87 2025-09-07T07:51:36.8175060Z * [new tag] cslpull88 -> cslpull88 2025-09-07T07:51:36.8176270Z * [new tag] cslpull89 -> cslpull89 2025-09-07T07:51:36.8176897Z * [new tag] cslpull90 -> cslpull90 2025-09-07T07:51:36.8178176Z * [new tag] cslpull91 -> cslpull91 2025-09-07T07:51:36.8179133Z * [new tag] cslpull92 -> cslpull92 2025-09-07T07:51:36.8180094Z * [new tag] flight_5 -> flight_5 2025-09-07T07:51:36.8181150Z * [new tag] flight_5.1 -> flight_5.1 2025-09-07T07:51:36.8182311Z * [new tag] flight_5.2 -> flight_5.2 2025-09-07T07:51:36.8183214Z * [new tag] flight_5.3 -> flight_5.3 2025-09-07T07:51:36.8184212Z * [new tag] forpull1 -> forpull1 2025-09-07T07:51:36.8185825Z * [new tag] malfet/tag-2ef5611 -> malfet/tag-2ef5611 2025-09-07T07:51:36.8186804Z * [new tag] malfet/tag-317b1a0 -> malfet/tag-317b1a0 
2025-09-07T07:51:36.8187971Z * [new tag] malfet/tag-ec6f767 -> malfet/tag-ec6f767 2025-09-07T07:51:36.8188875Z * [new tag] nightly-binary -> nightly-binary 2025-09-07T07:51:36.8189764Z * [new tag] sqzhang_flight4_plus -> sqzhang_flight4_plus 2025-09-07T07:51:36.8190826Z * [new tag] sqzhang_flight_3 -> sqzhang_flight_3 2025-09-07T07:51:36.8192378Z * [new tag] trunk/00636e0171e7e733628c408084805442270cf608 -> trunk/00636e0171e7e733628c408084805442270cf608 2025-09-07T07:51:36.8193471Z * [new tag] trunk/019fed39aa6b2dd8c69347378d53423e5efae8d4 -> trunk/019fed39aa6b2dd8c69347378d53423e5efae8d4 2025-09-07T07:51:36.8194456Z * [new tag] trunk/01ab325cc2e0dc221af4d710974e1b9175066544 -> trunk/01ab325cc2e0dc221af4d710974e1b9175066544 2025-09-07T07:51:36.8195890Z * [new tag] trunk/01edcd4df8bf0c7b4cc2d3ec868bd2059eeea83b -> trunk/01edcd4df8bf0c7b4cc2d3ec868bd2059eeea83b 2025-09-07T07:51:36.8197035Z * [new tag] trunk/040d00af048967dde7938d358d7f5988cbd18388 -> trunk/040d00af048967dde7938d358d7f5988cbd18388 2025-09-07T07:51:36.8198055Z * [new tag] trunk/0447f2d99b4351b2ff129dce6eebb371024f73e5 -> trunk/0447f2d99b4351b2ff129dce6eebb371024f73e5 2025-09-07T07:51:36.8199083Z * [new tag] trunk/047603d35bdc70046216384838d6340feab79bf4 -> trunk/047603d35bdc70046216384838d6340feab79bf4 2025-09-07T07:51:36.8200040Z * [new tag] trunk/06da7c0730b3764f178ec3a90dedf4ffa4202d81 -> trunk/06da7c0730b3764f178ec3a90dedf4ffa4202d81 2025-09-07T07:51:36.8201275Z * [new tag] trunk/081cab045472ce045634548cc6c14a4870641e23 -> trunk/081cab045472ce045634548cc6c14a4870641e23 2025-09-07T07:51:36.8202276Z * [new tag] trunk/09587daf8c9f21f5340f73921ce5f23d1a4a4572 -> trunk/09587daf8c9f21f5340f73921ce5f23d1a4a4572 2025-09-07T07:51:36.8203252Z * [new tag] trunk/09be1890d72cc34fc946965dc4a27736bf0ca8c6 -> trunk/09be1890d72cc34fc946965dc4a27736bf0ca8c6 2025-09-07T07:51:36.8204348Z * [new tag] trunk/09d2f1b6315d6d416fbf452793d65795863ebc66 -> trunk/09d2f1b6315d6d416fbf452793d65795863ebc66 
2025-09-07T07:51:36.8205510Z * [new tag] trunk/0af70e2353e1dcda83175fd4834ecb7b63e009e0 -> trunk/0af70e2353e1dcda83175fd4834ecb7b63e009e0 2025-09-07T07:51:36.8207027Z * [new tag] trunk/0c0e056a9e20c17271a6144dd32c0c7e3ba26736 -> trunk/0c0e056a9e20c17271a6144dd32c0c7e3ba26736 2025-09-07T07:51:36.8208055Z * [new tag] trunk/0cd6c56bdfa9178ff61be82ce3b178926ddb64a9 -> trunk/0cd6c56bdfa9178ff61be82ce3b178926ddb64a9 2025-09-07T07:51:36.8208990Z * [new tag] trunk/0d421ace32c1605ee8e452ee1eeb03bd243dd96c -> trunk/0d421ace32c1605ee8e452ee1eeb03bd243dd96c 2025-09-07T07:51:36.8210157Z * [new tag] trunk/0d71a9dd5b4b6d1dde58d91c9b71d96bc6a6a171 -> trunk/0d71a9dd5b4b6d1dde58d91c9b71d96bc6a6a171 2025-09-07T07:51:36.8211150Z * [new tag] trunk/0d84ff3b78f55492d3d4708458c92d776274939e -> trunk/0d84ff3b78f55492d3d4708458c92d776274939e 2025-09-07T07:51:36.8212115Z * [new tag] trunk/0f45aaf4414048b17d720d0915ce221a8de8ec63 -> trunk/0f45aaf4414048b17d720d0915ce221a8de8ec63 2025-09-07T07:51:36.8213131Z * [new tag] trunk/0ff8eabf1387de5acd6712a03bda61f1a3dfa27f -> trunk/0ff8eabf1387de5acd6712a03bda61f1a3dfa27f 2025-09-07T07:51:36.8214112Z * [new tag] trunk/104f2680e03d13a4765ca69f905d8f16fc0c822f -> trunk/104f2680e03d13a4765ca69f905d8f16fc0c822f 2025-09-07T07:51:36.8215279Z * [new tag] trunk/12814701555d3e41dfcdf8f9273af5821e322df0 -> trunk/12814701555d3e41dfcdf8f9273af5821e322df0 2025-09-07T07:51:36.8216459Z * [new tag] trunk/13b65196db422bdb394cb482e208c61ed448898c -> trunk/13b65196db422bdb394cb482e208c61ed448898c 2025-09-07T07:51:36.8217600Z * [new tag] trunk/13d66e2a66eceed14b8a8f5a971087df4f688a46 -> trunk/13d66e2a66eceed14b8a8f5a971087df4f688a46 2025-09-07T07:51:36.8218438Z * [new tag] trunk/145a3a7bda15e3963a33eb1b54bba5d4a270b225 -> trunk/145a3a7bda15e3963a33eb1b54bba5d4a270b225 2025-09-07T07:51:36.8219476Z * [new tag] trunk/146371483318e17929daefd37c8e459d9d6d47bb -> trunk/146371483318e17929daefd37c8e459d9d6d47bb 2025-09-07T07:51:36.8220634Z * [new tag] 
trunk/15c77a8cfd341e74fd124b077492ef2bfa51b339 -> trunk/15c77a8cfd341e74fd124b077492ef2bfa51b339 2025-09-07T07:51:36.8221836Z * [new tag] trunk/17fa8eec4a1e32939ab4d364ee6e75487a79b654 -> trunk/17fa8eec4a1e32939ab4d364ee6e75487a79b654 2025-09-07T07:51:36.8223296Z * [new tag] trunk/190c391a28845a14df26abb228d26aa813efb20c -> trunk/190c391a28845a14df26abb228d26aa813efb20c 2025-09-07T07:51:36.8224395Z * [new tag] trunk/1a588ace4667bde1331fbd8ed957157dca5cee68 -> trunk/1a588ace4667bde1331fbd8ed957157dca5cee68 2025-09-07T07:51:36.8225876Z * [new tag] trunk/1aa7476885e8f6e7b0ec3a5b6383aad9d3f343e7 -> trunk/1aa7476885e8f6e7b0ec3a5b6383aad9d3f343e7 2025-09-07T07:51:36.8226642Z * [new tag] trunk/1aeb421c342c9e9607842f4c87cb46e8e816ee53 -> trunk/1aeb421c342c9e9607842f4c87cb46e8e816ee53 2025-09-07T07:51:36.8227695Z * [new tag] trunk/1c1b28d5b6a942fafe23b2f09302d93c25226d4a -> trunk/1c1b28d5b6a942fafe23b2f09302d93c25226d4a 2025-09-07T07:51:36.8228721Z * [new tag] trunk/1ebd70d0c0d562d3be9abdee2a21906584af7d99 -> trunk/1ebd70d0c0d562d3be9abdee2a21906584af7d99 2025-09-07T07:51:36.8229832Z * [new tag] trunk/1ec2c15914da4ef7bd926ed9aebc8671c75fe965 -> trunk/1ec2c15914da4ef7bd926ed9aebc8671c75fe965 2025-09-07T07:51:36.8230877Z * [new tag] trunk/1f51056bd64e73d1aa81321bc3c098575b1bc78a -> trunk/1f51056bd64e73d1aa81321bc3c098575b1bc78a 2025-09-07T07:51:36.8231967Z * [new tag] trunk/1f820de639c75a1562d3fb03f160439f853ae07b -> trunk/1f820de639c75a1562d3fb03f160439f853ae07b 2025-09-07T07:51:36.8233041Z * [new tag] trunk/204697f0e695d82894c5010fbec664c4391f90cc -> trunk/204697f0e695d82894c5010fbec664c4391f90cc 2025-09-07T07:51:36.8234019Z * [new tag] trunk/20629b1619fe636227d01fc85ba221daa7185a05 -> trunk/20629b1619fe636227d01fc85ba221daa7185a05 2025-09-07T07:51:36.8235168Z * [new tag] trunk/20b47acef845e9c4f71da9429a396d293f50ebe7 -> trunk/20b47acef845e9c4f71da9429a396d293f50ebe7 2025-09-07T07:51:36.8236302Z * [new tag] trunk/20bfb2539d7c5250379648eda35f80b8a7d642dd -> 
trunk/20bfb2539d7c5250379648eda35f80b8a7d642dd 2025-09-07T07:51:36.8237394Z * [new tag] trunk/21fae99c180d17def562797ea0fb154d8fdf88e3 -> trunk/21fae99c180d17def562797ea0fb154d8fdf88e3 2025-09-07T07:51:36.8238714Z * [new tag] trunk/248355faf53f9f7ba2fd0a367d59600c6d991e7f -> trunk/248355faf53f9f7ba2fd0a367d59600c6d991e7f 2025-09-07T07:51:36.8239870Z * [new tag] trunk/25f4aaed9ec26f39c13862323ff8582006473d23 -> trunk/25f4aaed9ec26f39c13862323ff8582006473d23 2025-09-07T07:51:36.8240860Z * [new tag] trunk/261a84a1764412f8e659c956e3f81997ec3de9d5 -> trunk/261a84a1764412f8e659c956e3f81997ec3de9d5 2025-09-07T07:51:36.8242009Z * [new tag] trunk/28f4ab0737937858730f29f5c4e601e109cf9d5f -> trunk/28f4ab0737937858730f29f5c4e601e109cf9d5f 2025-09-07T07:51:36.8243157Z * [new tag] trunk/291cd11f2d5df6f48d348cce0e4e762f274f4dc4 -> trunk/291cd11f2d5df6f48d348cce0e4e762f274f4dc4 2025-09-07T07:51:36.8244231Z * [new tag] trunk/29280864d941e6108ab57f7298f520c0cf9696e9 -> trunk/29280864d941e6108ab57f7298f520c0cf9696e9 2025-09-07T07:51:36.8245554Z * [new tag] trunk/2a45837e98c63cae9d1a2e2133a727b829e549d5 -> trunk/2a45837e98c63cae9d1a2e2133a727b829e549d5 2025-09-07T07:51:36.8246713Z * [new tag] trunk/2a5c0785e2f975697fd7bdf1411de6e03dcaa1ef -> trunk/2a5c0785e2f975697fd7bdf1411de6e03dcaa1ef 2025-09-07T07:51:36.8248004Z * [new tag] trunk/2b8a83901c58a0858ea9e4ce00055f48e6ed164c -> trunk/2b8a83901c58a0858ea9e4ce00055f48e6ed164c 2025-09-07T07:51:36.8248867Z * [new tag] trunk/2ba65472dd54488a86a50326ea990195fc6732d6 -> trunk/2ba65472dd54488a86a50326ea990195fc6732d6 2025-09-07T07:51:36.8249957Z * [new tag] trunk/2c03f0acc53ed13fe8ebfe809129f25996e009a0 -> trunk/2c03f0acc53ed13fe8ebfe809129f25996e009a0 2025-09-07T07:51:36.8251050Z * [new tag] trunk/2dd529df0092799f68ee7afcf52338276906706a -> trunk/2dd529df0092799f68ee7afcf52338276906706a 2025-09-07T07:51:36.8252159Z * [new tag] trunk/2f6b4b1ad3f82bb3bd984f6e65744ea339ffb8b5 -> trunk/2f6b4b1ad3f82bb3bd984f6e65744ea339ffb8b5 
2025-09-07T07:51:36.8253206Z * [new tag] trunk/2fa0520a64ed8aa734a56c4d124958f0b5711ca8 -> trunk/2fa0520a64ed8aa734a56c4d124958f0b5711ca8 2025-09-07T07:51:36.8254274Z * [new tag] trunk/302df2ac5dc4222294c09d48804a2dddb8f4bad8 -> trunk/302df2ac5dc4222294c09d48804a2dddb8f4bad8 2025-09-07T07:51:36.8255244Z * [new tag] trunk/33028597bfa2e0178e28c8cce33cb9b3800cac43 -> trunk/33028597bfa2e0178e28c8cce33cb9b3800cac43 2025-09-07T07:51:36.8256528Z * [new tag] trunk/34aa78274d6770086025a967fa63a86830e08176 -> trunk/34aa78274d6770086025a967fa63a86830e08176 2025-09-07T07:51:36.8257692Z * [new tag] trunk/3559c354ce6a14d11fe29fb12fa2747a2f2af449 -> trunk/3559c354ce6a14d11fe29fb12fa2747a2f2af449 2025-09-07T07:51:36.8258548Z * [new tag] trunk/36d207fcaaede0d1e58a5168084c307b32b6fd8b -> trunk/36d207fcaaede0d1e58a5168084c307b32b6fd8b 2025-09-07T07:51:36.8259520Z * [new tag] trunk/377033757ae5ca524ea842f1b0a5f446ed3d8fe0 -> trunk/377033757ae5ca524ea842f1b0a5f446ed3d8fe0 2025-09-07T07:51:36.8261678Z * [new tag] trunk/3771380f83fcac154a7c89ad679311d8c4818287 -> trunk/3771380f83fcac154a7c89ad679311d8c4818287 2025-09-07T07:51:36.8262205Z * [new tag] trunk/3a207816cc569f78863d86c01f2a3d265350e39f -> trunk/3a207816cc569f78863d86c01f2a3d265350e39f 2025-09-07T07:51:36.8263575Z * [new tag] trunk/3a20a20e7065ec927fdd216d4da3b04f879b3c67 -> trunk/3a20a20e7065ec927fdd216d4da3b04f879b3c67 2025-09-07T07:51:36.8264358Z * [new tag] trunk/3bbc2e3e4f025523eaa5dbff220b3e96bca608d0 -> trunk/3bbc2e3e4f025523eaa5dbff220b3e96bca608d0 2025-09-07T07:51:36.8265300Z * [new tag] trunk/3c0ff1b569c45cfa6935ad8031a9d4cf1551aa3f -> trunk/3c0ff1b569c45cfa6935ad8031a9d4cf1551aa3f 2025-09-07T07:51:36.8266556Z * [new tag] trunk/3c45af079afc92a03b03ddf4f9198902ffcf30cf -> trunk/3c45af079afc92a03b03ddf4f9198902ffcf30cf 2025-09-07T07:51:36.8267581Z * [new tag] trunk/3dde5d7f9bf80dd6623a712bc429e9e4302464b5 -> trunk/3dde5d7f9bf80dd6623a712bc429e9e4302464b5 2025-09-07T07:51:36.8268790Z * [new tag] 
trunk/403a3a393cda7e60f503f3b04b8805a845dcf45d -> trunk/403a3a393cda7e60f503f3b04b8805a845dcf45d 2025-09-07T07:51:36.8270101Z * [new tag] trunk/420c52ecf36f86d32da0853bfbe074b682b070aa -> trunk/420c52ecf36f86d32da0853bfbe074b682b070aa 2025-09-07T07:51:36.8271327Z * [new tag] trunk/43b7c86a2c0f91320f5c5f4827b111edff06fdb6 -> trunk/43b7c86a2c0f91320f5c5f4827b111edff06fdb6 2025-09-07T07:51:36.8272377Z * [new tag] trunk/451ed931562ec8b46d1f7e6c266a68132a119336 -> trunk/451ed931562ec8b46d1f7e6c266a68132a119336 2025-09-07T07:51:36.8273518Z * [new tag] trunk/480c7391126656154318fabf1d57ebc01e196e63 -> trunk/480c7391126656154318fabf1d57ebc01e196e63 2025-09-07T07:51:36.8274609Z * [new tag] trunk/48bedd753da22634aa94fbafeb731e82025404f3 -> trunk/48bedd753da22634aa94fbafeb731e82025404f3 2025-09-07T07:51:36.8276191Z * [new tag] trunk/494878a11b79071ada0b98f34042d47155be6d1c -> trunk/494878a11b79071ada0b98f34042d47155be6d1c 2025-09-07T07:51:36.8277609Z * [new tag] trunk/4ae57d448c0a7d37e4cfd5c27d977fad2cef4051 -> trunk/4ae57d448c0a7d37e4cfd5c27d977fad2cef4051 2025-09-07T07:51:36.8278608Z * [new tag] trunk/4cdaf8265d86f984254b62052da8c26ef61ef1cf -> trunk/4cdaf8265d86f984254b62052da8c26ef61ef1cf 2025-09-07T07:51:36.8279765Z * [new tag] trunk/4d4abec80f03cd8fdefe1d9cb3a60d3690cd777e -> trunk/4d4abec80f03cd8fdefe1d9cb3a60d3690cd777e 2025-09-07T07:51:36.8280859Z * [new tag] trunk/4e42aa8ffc44b8340eb0eeaf80a2cafc4763a186 -> trunk/4e42aa8ffc44b8340eb0eeaf80a2cafc4763a186 2025-09-07T07:51:36.8282008Z * [new tag] trunk/4f72d932feee0749397fec876dcd43994f50b215 -> trunk/4f72d932feee0749397fec876dcd43994f50b215 2025-09-07T07:51:36.8283171Z * [new tag] trunk/50fc22dedf3c4a27be61fa05551c4f320281b42d -> trunk/50fc22dedf3c4a27be61fa05551c4f320281b42d 2025-09-07T07:51:36.8284286Z * [new tag] trunk/5211f1f908907ffc064b56e43cf8659f7fc22aa9 -> trunk/5211f1f908907ffc064b56e43cf8659f7fc22aa9 2025-09-07T07:51:36.8286170Z * [new tag] trunk/524b78d4f67045b83bb69edc56ab16efe282971c -> 
trunk/524b78d4f67045b83bb69edc56ab16efe282971c 2025-09-07T07:51:36.8287348Z * [new tag] trunk/54e275e0d81fe1e1ccfa4fb5f2a5a9aaca00ca15 -> trunk/54e275e0d81fe1e1ccfa4fb5f2a5a9aaca00ca15 2025-09-07T07:51:36.8288322Z * [new tag] trunk/5561e45758d59c94605873d5db48ed459c004c3b -> trunk/5561e45758d59c94605873d5db48ed459c004c3b 2025-09-07T07:51:36.8289584Z * [new tag] trunk/57278d45f046d4f89f45d373b1af4dd56934ff24 -> trunk/57278d45f046d4f89f45d373b1af4dd56934ff24 2025-09-07T07:51:36.8290675Z * [new tag] trunk/5927a70934ccf7b70182d364c23245a7dd685503 -> trunk/5927a70934ccf7b70182d364c23245a7dd685503 2025-09-07T07:51:36.8291799Z * [new tag] trunk/5985e28912aeb40b103ebfcf2fd0665eb4a50599 -> trunk/5985e28912aeb40b103ebfcf2fd0665eb4a50599 2025-09-07T07:51:36.8292962Z * [new tag] trunk/5a2da090ed6db88bb657c4e51ec0b310cd08bff6 -> trunk/5a2da090ed6db88bb657c4e51ec0b310cd08bff6 2025-09-07T07:51:36.8294089Z * [new tag] trunk/5c473e9f5ee0ef0fc38e6cf34a95b547f8cdc8d5 -> trunk/5c473e9f5ee0ef0fc38e6cf34a95b547f8cdc8d5 2025-09-07T07:51:36.8295395Z * [new tag] trunk/5c67426d6847667a7c55a2dd01f470fa37238c18 -> trunk/5c67426d6847667a7c55a2dd01f470fa37238c18 2025-09-07T07:51:36.8296699Z * [new tag] trunk/5da573c42c332bc68d4b7946c69f690a876d951a -> trunk/5da573c42c332bc68d4b7946c69f690a876d951a 2025-09-07T07:51:36.8297804Z * [new tag] trunk/5e5870e858f60ff4bf87d03f3592097e934a9580 -> trunk/5e5870e858f60ff4bf87d03f3592097e934a9580 2025-09-07T07:51:36.8298978Z * [new tag] trunk/5f3cbc9442aa55b5afb29f4ac8ca9be569003e84 -> trunk/5f3cbc9442aa55b5afb29f4ac8ca9be569003e84 2025-09-07T07:51:36.8300086Z * [new tag] trunk/600c25e9a17fe56e3dee872be8854db08916ba0c -> trunk/600c25e9a17fe56e3dee872be8854db08916ba0c 2025-09-07T07:51:36.8301248Z * [new tag] trunk/601ae8e4831fc8123fffcfb8fd2e6b6381b42e14 -> trunk/601ae8e4831fc8123fffcfb8fd2e6b6381b42e14 2025-09-07T07:51:36.8302529Z * [new tag] trunk/6087ef41e54c2494b117ffd923faf20f515a6806 -> trunk/6087ef41e54c2494b117ffd923faf20f515a6806 
2025-09-07T07:51:36.8303688Z * [new tag] trunk/626cb7df8161dd4ecb4fe43b60f37ce9076f56b1 -> trunk/626cb7df8161dd4ecb4fe43b60f37ce9076f56b1 2025-09-07T07:51:36.8304827Z * [new tag] trunk/62c3f9a97fd3dea7132a93066d32d893ffe101e6 -> trunk/62c3f9a97fd3dea7132a93066d32d893ffe101e6 2025-09-07T07:51:36.8306258Z * [new tag] trunk/63a9c23fe99eacfd09610c36dfe8f01b053c1a35 -> trunk/63a9c23fe99eacfd09610c36dfe8f01b053c1a35 2025-09-07T07:51:36.8307366Z * [new tag] trunk/65985937d97505f648b6ed852c3129f2dd08b251 -> trunk/65985937d97505f648b6ed852c3129f2dd08b251 2025-09-07T07:51:36.8309259Z * [new tag] trunk/66f3b4a682a6153517dd23369fdc3289b6494b07 -> trunk/66f3b4a682a6153517dd23369fdc3289b6494b07 2025-09-07T07:51:36.8310241Z * [new tag] trunk/6737e2c996990024187ba620d2764f3b6f6add2c -> trunk/6737e2c996990024187ba620d2764f3b6f6add2c 2025-09-07T07:51:36.8311525Z * [new tag] trunk/67c31dcd364f10072a55f4a30ffd1151c686283a -> trunk/67c31dcd364f10072a55f4a30ffd1151c686283a 2025-09-07T07:51:36.8312742Z * [new tag] trunk/68738beff73e9c3512e18b4edea811a897ce42db -> trunk/68738beff73e9c3512e18b4edea811a897ce42db 2025-09-07T07:51:36.8313914Z * [new tag] trunk/69a25f68884a168550695fdb1a7c310c54d29536 -> trunk/69a25f68884a168550695fdb1a7c310c54d29536 2025-09-07T07:51:36.8315287Z * [new tag] trunk/6b1900c22f1a07b9519346898d4c71d8a2b0f12f -> trunk/6b1900c22f1a07b9519346898d4c71d8a2b0f12f 2025-09-07T07:51:36.8316566Z * [new tag] trunk/6b8b3ac4403f771bd4a8f9a45d93347304148774 -> trunk/6b8b3ac4403f771bd4a8f9a45d93347304148774 2025-09-07T07:51:36.8317648Z * [new tag] trunk/6f7608d603834d6068b2e7a5d59bec3973b6bb1b -> trunk/6f7608d603834d6068b2e7a5d59bec3973b6bb1b 2025-09-07T07:51:36.8318830Z * [new tag] trunk/70d36e047dfb3488fd6335016711a784d810ebda -> trunk/70d36e047dfb3488fd6335016711a784d810ebda 2025-09-07T07:51:36.8319920Z * [new tag] trunk/71992dd805ff9d6763f77214dfe8b0465e88c87b -> trunk/71992dd805ff9d6763f77214dfe8b0465e88c87b 2025-09-07T07:51:36.8321120Z * [new tag] 
trunk/734ce8eba9c69381f187359bf0fef1d71d84cd20 -> trunk/734ce8eba9c69381f187359bf0fef1d71d84cd20 2025-09-07T07:51:36.8322254Z * [new tag] trunk/73eb4511fb863a37944342b7e92aae706de603c8 -> trunk/73eb4511fb863a37944342b7e92aae706de603c8 2025-09-07T07:51:36.8323399Z * [new tag] trunk/75bc23cfc345bd4c05e7f97c416c4b3d2d1fa64b -> trunk/75bc23cfc345bd4c05e7f97c416c4b3d2d1fa64b 2025-09-07T07:51:36.8324507Z * [new tag] trunk/771f369448321a387f2018535bc8b8b6e5f12fab -> trunk/771f369448321a387f2018535bc8b8b6e5f12fab 2025-09-07T07:51:36.8326178Z * [new tag] trunk/789d4942127143f2adcb53612c058ce4c9a2cf20 -> trunk/789d4942127143f2adcb53612c058ce4c9a2cf20 2025-09-07T07:51:36.8327221Z * [new tag] trunk/791eff96c85678c950888f9da24650083ee673fe -> trunk/791eff96c85678c950888f9da24650083ee673fe 2025-09-07T07:51:36.8328393Z * [new tag] trunk/793fc12aff1f69fbbf9f4278182fb52bbe350fc9 -> trunk/793fc12aff1f69fbbf9f4278182fb52bbe350fc9 2025-09-07T07:51:36.8329644Z * [new tag] trunk/79fcd5247a9a129eee526a14df30bfc6a22b3f01 -> trunk/79fcd5247a9a129eee526a14df30bfc6a22b3f01 2025-09-07T07:51:36.8330747Z * [new tag] trunk/7f4ff79210eb06924f223ae3a1941ee0e2635348 -> trunk/7f4ff79210eb06924f223ae3a1941ee0e2635348 2025-09-07T07:51:36.8331887Z * [new tag] trunk/8076a185c85112be62be292eb47409c88a585b1c -> trunk/8076a185c85112be62be292eb47409c88a585b1c 2025-09-07T07:51:36.8333079Z * [new tag] trunk/80dd397f1979371a5583fa3d5c7352029522a78d -> trunk/80dd397f1979371a5583fa3d5c7352029522a78d 2025-09-07T07:51:36.8334100Z * [new tag] trunk/8171d6052ec12628eb67e0040839314056014429 -> trunk/8171d6052ec12628eb67e0040839314056014429 2025-09-07T07:51:36.8335451Z * [new tag] trunk/81aeefa657b7ccc26b275c50a9f33b2f056e8071 -> trunk/81aeefa657b7ccc26b275c50a9f33b2f056e8071 2025-09-07T07:51:36.8336631Z * [new tag] trunk/81b7b16618bda250ce55982894a83dc0805eb64c -> trunk/81b7b16618bda250ce55982894a83dc0805eb64c 2025-09-07T07:51:36.8337754Z * [new tag] trunk/827f0d405448de31f79d1089f7d7fceab2f87895 -> 
trunk/827f0d405448de31f79d1089f7d7fceab2f87895 2025-09-07T07:51:36.8338869Z * [new tag] trunk/82f63c8f6de63c30132a8ac299b6e8c2fd0d3fe8 -> trunk/82f63c8f6de63c30132a8ac299b6e8c2fd0d3fe8 2025-09-07T07:51:36.8340032Z * [new tag] trunk/850e1382a9c56bfde18af09d3e72352d775e9435 -> trunk/850e1382a9c56bfde18af09d3e72352d775e9435 2025-09-07T07:51:36.8341385Z * [new tag] trunk/8678d831c48e616b717bff50f2d03141d2e9f965 -> trunk/8678d831c48e616b717bff50f2d03141d2e9f965 2025-09-07T07:51:36.8342640Z * [new tag] trunk/869cbcc16e489a4f5a14a93d5779b0ea86061c60 -> trunk/869cbcc16e489a4f5a14a93d5779b0ea86061c60 2025-09-07T07:51:36.8343840Z * [new tag] trunk/8703debf669bc2238211bfd039f4ecdd8228b7f7 -> trunk/8703debf669bc2238211bfd039f4ecdd8228b7f7 2025-09-07T07:51:36.8345132Z * [new tag] trunk/874069fbe46e82da5cfa405e6c0deb12e89ff608 -> trunk/874069fbe46e82da5cfa405e6c0deb12e89ff608 2025-09-07T07:51:36.8346552Z * [new tag] trunk/8875d6e394da2fffd04f31b28bf258c94d4776a3 -> trunk/8875d6e394da2fffd04f31b28bf258c94d4776a3 2025-09-07T07:51:36.8347834Z * [new tag] trunk/88d94d17e8c5155451393afa6eb3bab48ab61c16 -> trunk/88d94d17e8c5155451393afa6eb3bab48ab61c16 2025-09-07T07:51:36.8349220Z * [new tag] trunk/890626632def7e0ef95a2d01e87a0e4627824a9f -> trunk/890626632def7e0ef95a2d01e87a0e4627824a9f 2025-09-07T07:51:36.8350651Z * [new tag] trunk/8975cda2520b7b1b5bc3b4d8213edf261fa82570 -> trunk/8975cda2520b7b1b5bc3b4d8213edf261fa82570 2025-09-07T07:51:36.8351765Z * [new tag] trunk/89d41d3f61d04f14730ec26f008a59bef6624610 -> trunk/89d41d3f61d04f14730ec26f008a59bef6624610 2025-09-07T07:51:36.8352981Z * [new tag] trunk/8bb213b6d599ef1273fe52f9b1f6d476056c3a41 -> trunk/8bb213b6d599ef1273fe52f9b1f6d476056c3a41 2025-09-07T07:51:36.8354223Z * [new tag] trunk/8e23a1227b5fb2e39afaa7d57c075a75b640a5af -> trunk/8e23a1227b5fb2e39afaa7d57c075a75b640a5af 2025-09-07T07:51:36.8355991Z * [new tag] trunk/8ec551bb354ab2b85fbbba9d461740a20366d248 -> trunk/8ec551bb354ab2b85fbbba9d461740a20366d248 
2025-09-07T07:51:36.8357392Z * [new tag] trunk/8fd3c9ce919c8d5c645fd348bba517e948cbc29d -> trunk/8fd3c9ce919c8d5c645fd348bba517e948cbc29d 2025-09-07T07:51:36.8358573Z * [new tag] trunk/90f50f7e68e120d9574e6e3189e37b4280010ad9 -> trunk/90f50f7e68e120d9574e6e3189e37b4280010ad9 2025-09-07T07:51:36.8359707Z * [new tag] trunk/91f0bcf43fc0bc743350d491ac63b77e92054ac9 -> trunk/91f0bcf43fc0bc743350d491ac63b77e92054ac9 2025-09-07T07:51:36.8361006Z * [new tag] trunk/92576a594b8121f6b0b1b5a3ea16d08792fc68ab -> trunk/92576a594b8121f6b0b1b5a3ea16d08792fc68ab 2025-09-07T07:51:36.8362105Z * [new tag] trunk/92a43025e0baa1f2ce345f28d22913b518a1ab9d -> trunk/92a43025e0baa1f2ce345f28d22913b518a1ab9d 2025-09-07T07:51:36.8363392Z * [new tag] trunk/93fb23d6fae7c4e82c4239a1033e522088742634 -> trunk/93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:51:36.8364586Z * [new tag] trunk/9458d1ac3bd70c2af316a8ba95d2c6c9c1199c9c -> trunk/9458d1ac3bd70c2af316a8ba95d2c6c9c1199c9c 2025-09-07T07:51:36.8365972Z * [new tag] trunk/9480cdc0b61488c89a23c2f64f43b2dcedc8728e -> trunk/9480cdc0b61488c89a23c2f64f43b2dcedc8728e 2025-09-07T07:51:36.8367618Z * [new tag] trunk/9491d289b329e4ba4a9f5f5b1be7960671bb7840 -> trunk/9491d289b329e4ba4a9f5f5b1be7960671bb7840 2025-09-07T07:51:36.8368751Z * [new tag] trunk/9499c8761cd2067feb9877414e818f6fd00290f1 -> trunk/9499c8761cd2067feb9877414e818f6fd00290f1 2025-09-07T07:51:36.8369899Z * [new tag] trunk/95ee0bfea99d3d346d6502b91b497d2b35795504 -> trunk/95ee0bfea99d3d346d6502b91b497d2b35795504 2025-09-07T07:51:36.8371012Z * [new tag] trunk/98374612fc2febd686be20761e56bdc2424bc36a -> trunk/98374612fc2febd686be20761e56bdc2424bc36a 2025-09-07T07:51:36.8372311Z * [new tag] trunk/98efc9e93d8fc61eb53cb91378443617cb550500 -> trunk/98efc9e93d8fc61eb53cb91378443617cb550500 2025-09-07T07:51:36.8373491Z * [new tag] trunk/994f2a5dbcbdc915da39bf6f6ce4d1f5e74835c9 -> trunk/994f2a5dbcbdc915da39bf6f6ce4d1f5e74835c9 2025-09-07T07:51:36.8374786Z * [new tag] 
trunk/99f356fa58c8d726cef022d8710f5491291158f6 -> trunk/99f356fa58c8d726cef022d8710f5491291158f6 2025-09-07T07:51:36.8375965Z * [new tag] trunk/9a1c5c0a078b94d13ac5c1ae0d754d19fb73bf99 -> trunk/9a1c5c0a078b94d13ac5c1ae0d754d19fb73bf99 2025-09-07T07:51:36.8377125Z * [new tag] trunk/9a665ca3c472384e9d722bddba79e5a7680f1abd -> trunk/9a665ca3c472384e9d722bddba79e5a7680f1abd 2025-09-07T07:51:36.8378296Z * [new tag] trunk/9aedb3cd87b52160872173c177f61053d97bed57 -> trunk/9aedb3cd87b52160872173c177f61053d97bed57 2025-09-07T07:51:36.8379398Z * [new tag] trunk/9b81fe281da41f2421506339d26b027a468902f4 -> trunk/9b81fe281da41f2421506339d26b027a468902f4 2025-09-07T07:51:36.8380498Z * [new tag] trunk/9bdcee01f86e2969cff1140cdecfca13cb51816e -> trunk/9bdcee01f86e2969cff1140cdecfca13cb51816e 2025-09-07T07:51:36.8381680Z * [new tag] trunk/9c03d6be87eedc06e524e202e07a7e776551a839 -> trunk/9c03d6be87eedc06e524e202e07a7e776551a839 2025-09-07T07:51:36.8382914Z * [new tag] trunk/9c957723a0fedd9c637e63e023a613019e2cab60 -> trunk/9c957723a0fedd9c637e63e023a613019e2cab60 2025-09-07T07:51:36.8384045Z * [new tag] trunk/9e5247f51d81735e5f1e65e80588985fa93bccc5 -> trunk/9e5247f51d81735e5f1e65e80588985fa93bccc5 2025-09-07T07:51:36.8385444Z * [new tag] trunk/9eadb37cdd699f7e8e8177a5227bfeb16184ef26 -> trunk/9eadb37cdd699f7e8e8177a5227bfeb16184ef26 2025-09-07T07:51:36.8386740Z * [new tag] trunk/a00cdc1e4159db73c9ffb3f25e93e55877709a29 -> trunk/a00cdc1e4159db73c9ffb3f25e93e55877709a29 2025-09-07T07:51:36.8387832Z * [new tag] trunk/a02ee4a816d11380c6f564c1aba64d56af5ba705 -> trunk/a02ee4a816d11380c6f564c1aba64d56af5ba705 2025-09-07T07:51:36.8388899Z * [new tag] trunk/a3c7f77e50f900721817934120d60c2361b3c40d -> trunk/a3c7f77e50f900721817934120d60c2361b3c40d 2025-09-07T07:51:36.8390101Z * [new tag] trunk/a3d72b09ae12126a2b7d4a63a45ac100a882a802 -> trunk/a3d72b09ae12126a2b7d4a63a45ac100a882a802 2025-09-07T07:51:36.8391295Z * [new tag] trunk/a3e5466002791da609fcb069155d8ee347baee92 -> 
trunk/a3e5466002791da609fcb069155d8ee347baee92 2025-09-07T07:51:36.8392412Z * [new tag] trunk/a714437093ed196eee28f7de454cf4c41badc098 -> trunk/a714437093ed196eee28f7de454cf4c41badc098 2025-09-07T07:51:36.8393553Z * [new tag] trunk/a75e8cd27098f290de0b7439685d05ce02e91356 -> trunk/a75e8cd27098f290de0b7439685d05ce02e91356 2025-09-07T07:51:36.8394558Z * [new tag] trunk/a8d6943d36c1c2a5f90d3573460695bad4b623ae -> trunk/a8d6943d36c1c2a5f90d3573460695bad4b623ae 2025-09-07T07:51:36.8395903Z * [new tag] trunk/a918bbad6ab20649ff82eefb48417ecbe96bcb34 -> trunk/a918bbad6ab20649ff82eefb48417ecbe96bcb34 2025-09-07T07:51:36.8397111Z * [new tag] trunk/a99d8d39bc842d6ebc3e368b178e4884d24b056e -> trunk/a99d8d39bc842d6ebc3e368b178e4884d24b056e 2025-09-07T07:51:36.8398402Z * [new tag] trunk/aac1a50a191b4102d566c9c1ea22f06d6c2e3f02 -> trunk/aac1a50a191b4102d566c9c1ea22f06d6c2e3f02 2025-09-07T07:51:36.8399753Z * [new tag] trunk/aad96a202244c7d0d120c04ba8db593edd8c0f92 -> trunk/aad96a202244c7d0d120c04ba8db593edd8c0f92 2025-09-07T07:51:36.8400905Z * [new tag] trunk/ab643e4dbbaf7b663d4237514cbf01af9b11565c -> trunk/ab643e4dbbaf7b663d4237514cbf01af9b11565c 2025-09-07T07:51:36.8402040Z * [new tag] trunk/abc447174cd2cf8591edbc70a9f836f9a5779f47 -> trunk/abc447174cd2cf8591edbc70a9f836f9a5779f47 2025-09-07T07:51:36.8403215Z * [new tag] trunk/acece97c3a9dceb63194e314da93fdf37cf15a0d -> trunk/acece97c3a9dceb63194e314da93fdf37cf15a0d 2025-09-07T07:51:36.8404283Z * [new tag] trunk/ada43ed39c80b746b4822c92640a1882619e2795 -> trunk/ada43ed39c80b746b4822c92640a1882619e2795 2025-09-07T07:51:36.8405548Z * [new tag] trunk/adae7f66aacf3f248c3101b858cf98d5809119fa -> trunk/adae7f66aacf3f248c3101b858cf98d5809119fa 2025-09-07T07:51:36.8406948Z * [new tag] trunk/ae0edc133e61e3b16caf0b2ee0ff3f33ab72af4c -> trunk/ae0edc133e61e3b16caf0b2ee0ff3f33ab72af4c 2025-09-07T07:51:36.8407978Z * [new tag] trunk/aed33a8fcbd60b052d4559d261390c5797129c6d -> trunk/aed33a8fcbd60b052d4559d261390c5797129c6d 
2025-09-07T07:51:36.8409117Z * [new tag] trunk/b04e922712080a3652e438d05e8bb74e0cd2d238 -> trunk/b04e922712080a3652e438d05e8bb74e0cd2d238 2025-09-07T07:51:36.8410278Z * [new tag] trunk/b0a3e58dd71c1a039ac0ef51e5bd8f704f632f6f -> trunk/b0a3e58dd71c1a039ac0ef51e5bd8f704f632f6f 2025-09-07T07:51:36.8411445Z * [new tag] trunk/b16d3f4c8c01d461c2f01064e9ca5fa2b33f5cf1 -> trunk/b16d3f4c8c01d461c2f01064e9ca5fa2b33f5cf1 2025-09-07T07:51:36.8412525Z * [new tag] trunk/b18bb6796f210a183e687d9d64984a5a9d13cf09 -> trunk/b18bb6796f210a183e687d9d64984a5a9d13cf09 2025-09-07T07:51:36.8413738Z * [new tag] trunk/b1bb98ddebdd3e41bf7987372409bdce96ae55de -> trunk/b1bb98ddebdd3e41bf7987372409bdce96ae55de 2025-09-07T07:51:36.8414897Z * [new tag] trunk/b2b4add0e754411372060e1d7b4057a66439172b -> trunk/b2b4add0e754411372060e1d7b4057a66439172b 2025-09-07T07:51:36.8416374Z * [new tag] trunk/b2c7b9ad2dc5a7c0b61febd307761bd5bc2f0f05 -> trunk/b2c7b9ad2dc5a7c0b61febd307761bd5bc2f0f05 2025-09-07T07:51:36.8417559Z * [new tag] trunk/b40d9432be44a6b5974ee62e7d19c3c61c5ece37 -> trunk/b40d9432be44a6b5974ee62e7d19c3c61c5ece37 2025-09-07T07:51:36.8418688Z * [new tag] trunk/b4ad38279b178b7bd14355123c1101e2e853e77b -> trunk/b4ad38279b178b7bd14355123c1101e2e853e77b 2025-09-07T07:51:36.8419830Z * [new tag] trunk/b67c41039835bd9b20b83cd6233e86baaa5f5dde -> trunk/b67c41039835bd9b20b83cd6233e86baaa5f5dde 2025-09-07T07:51:36.8421171Z * [new tag] trunk/b6d0a9ea9056ede4f7024dbf3bd6c43be3aff49c -> trunk/b6d0a9ea9056ede4f7024dbf3bd6c43be3aff49c 2025-09-07T07:51:36.8422402Z * [new tag] trunk/b7dad7dd49448c88d0751fa2e29c70afe985f734 -> trunk/b7dad7dd49448c88d0751fa2e29c70afe985f734 2025-09-07T07:51:36.8424017Z * [new tag] trunk/b7e207ca9f046ddd716076965a0cce403ba99052 -> trunk/b7e207ca9f046ddd716076965a0cce403ba99052 2025-09-07T07:51:36.8425302Z * [new tag] trunk/b919560c4a7010e2d89facee25586269a994746e -> trunk/b919560c4a7010e2d89facee25586269a994746e 2025-09-07T07:51:36.8426680Z * [new tag] 
trunk/b9ba612f7a968f7b27e121ca8f4d0a4d954f5354 -> trunk/b9ba612f7a968f7b27e121ca8f4d0a4d954f5354 2025-09-07T07:51:36.8427807Z * [new tag] trunk/ba7f546ccccb5e0b36d9070dc25f26a9647f89f8 -> trunk/ba7f546ccccb5e0b36d9070dc25f26a9647f89f8 2025-09-07T07:51:36.8428952Z * [new tag] trunk/bb950284c7e72905994bc25dd436c10e48088d85 -> trunk/bb950284c7e72905994bc25dd436c10e48088d85 2025-09-07T07:51:36.8430133Z * [new tag] trunk/bbedc71fd3267c639c38b4ec25eaa22f973d9c4d -> trunk/bbedc71fd3267c639c38b4ec25eaa22f973d9c4d 2025-09-07T07:51:36.8431143Z * [new tag] trunk/bc4db2c27fce6ff1648bdc5af31ec225d2a31f37 -> trunk/bc4db2c27fce6ff1648bdc5af31ec225d2a31f37 2025-09-07T07:51:36.8432202Z * [new tag] trunk/bc505977fb66677a09c31155c987330fbb18a865 -> trunk/bc505977fb66677a09c31155c987330fbb18a865 2025-09-07T07:51:36.8433348Z * [new tag] trunk/bd39e47feea7326afb5bbb67fcb1e69279239527 -> trunk/bd39e47feea7326afb5bbb67fcb1e69279239527 2025-09-07T07:51:36.8434590Z * [new tag] trunk/be5b03dde96638f25ffd732a4fed7e41b4cf40e1 -> trunk/be5b03dde96638f25ffd732a4fed7e41b4cf40e1 2025-09-07T07:51:36.8436117Z * [new tag] trunk/bffc7dd1f374d8408911cd22c6b3d6df39ded9b3 -> trunk/bffc7dd1f374d8408911cd22c6b3d6df39ded9b3 2025-09-07T07:51:36.8437243Z * [new tag] trunk/c024b1f5a18d5c5aee5cc2acdd4c52b24b93ffcf -> trunk/c024b1f5a18d5c5aee5cc2acdd4c52b24b93ffcf 2025-09-07T07:51:36.8438545Z * [new tag] trunk/c0983e6cc0acf71689e1851d12609e00b3f59371 -> trunk/c0983e6cc0acf71689e1851d12609e00b3f59371 2025-09-07T07:51:36.8439646Z * [new tag] trunk/c10195e723eeeedd099ed8b73eda7184ca618fad -> trunk/c10195e723eeeedd099ed8b73eda7184ca618fad 2025-09-07T07:51:36.8440711Z * [new tag] trunk/c157cf6488ade6a7ee2ce2d25b059e1335630a99 -> trunk/c157cf6488ade6a7ee2ce2d25b059e1335630a99 2025-09-07T07:51:36.8441816Z * [new tag] trunk/c2a30246172fd71d56529907ffd3c27b76b1f3a7 -> trunk/c2a30246172fd71d56529907ffd3c27b76b1f3a7 2025-09-07T07:51:36.8442911Z * [new tag] trunk/c32111149921b48bfef909293f1049e21619ed76 -> 
trunk/c32111149921b48bfef909293f1049e21619ed76 2025-09-07T07:51:36.8443932Z * [new tag] trunk/c37103234afc832dcad307e9016230810957c9d5 -> trunk/c37103234afc832dcad307e9016230810957c9d5 2025-09-07T07:51:36.8445272Z * [new tag] trunk/c3ceca2995cd35e1376c4b0704669bff1a81e836 -> trunk/c3ceca2995cd35e1376c4b0704669bff1a81e836 2025-09-07T07:51:36.8446573Z * [new tag] trunk/c3d54dea9febb1236d48d19e5d4876a63f2e20fd -> trunk/c3d54dea9febb1236d48d19e5d4876a63f2e20fd 2025-09-07T07:51:36.8447697Z * [new tag] trunk/c465b3d52c5687fe910d35a5c75341b77f821741 -> trunk/c465b3d52c5687fe910d35a5c75341b77f821741 2025-09-07T07:51:36.8448837Z * [new tag] trunk/c5b8a10be5e89396da916d1069ffcb7135f0372b -> trunk/c5b8a10be5e89396da916d1069ffcb7135f0372b 2025-09-07T07:51:36.8449852Z * [new tag] trunk/c7e41071a08f4045bc11ab60ec366d7357d56e30 -> trunk/c7e41071a08f4045bc11ab60ec366d7357d56e30 2025-09-07T07:51:36.8451033Z * [new tag] trunk/c98ddaca6d2e19ca37aff00c4ff0cda1e9a6ff65 -> trunk/c98ddaca6d2e19ca37aff00c4ff0cda1e9a6ff65 2025-09-07T07:51:36.8452176Z * [new tag] trunk/cb1e31362c7b53acf4ac95b9f8878064c184f03b -> trunk/cb1e31362c7b53acf4ac95b9f8878064c184f03b 2025-09-07T07:51:36.8453268Z * [new tag] trunk/cbfb005f7cce79974795b148e265f594f59477c8 -> trunk/cbfb005f7cce79974795b148e265f594f59477c8 2025-09-07T07:51:36.8454438Z * [new tag] trunk/cc5bdd12401bda835291d2f3cb297132ebdbf358 -> trunk/cc5bdd12401bda835291d2f3cb297132ebdbf358 2025-09-07T07:51:36.8455955Z * [new tag] trunk/cd529b686d54bbaa443f5b310140de48422d96c7 -> trunk/cd529b686d54bbaa443f5b310140de48422d96c7 2025-09-07T07:51:36.8457114Z * [new tag] trunk/cec0ff122815582af5302360aff03676558c5c87 -> trunk/cec0ff122815582af5302360aff03676558c5c87 2025-09-07T07:51:36.8458448Z * [new tag] trunk/d11720efdb563d02cf4f7d324311fb15a755268e -> trunk/d11720efdb563d02cf4f7d324311fb15a755268e 2025-09-07T07:51:36.8459732Z * [new tag] trunk/d1706d9128ae24d9048167e80d3fe5196d19035e -> trunk/d1706d9128ae24d9048167e80d3fe5196d19035e 
2025-09-07T07:51:36.8461069Z * [new tag] trunk/d1a15abfdcaef138f2d9e93a9f46be44f30b766d -> trunk/d1a15abfdcaef138f2d9e93a9f46be44f30b766d 2025-09-07T07:51:36.8462508Z * [new tag] trunk/d232a95d4a79404ca05c1f52d37fde7339dcdf49 -> trunk/d232a95d4a79404ca05c1f52d37fde7339dcdf49 2025-09-07T07:51:36.8463618Z * [new tag] trunk/d2d4c8e9b2371c9aacfb771d9402ac7427b9778e -> trunk/d2d4c8e9b2371c9aacfb771d9402ac7427b9778e 2025-09-07T07:51:36.8464795Z * [new tag] trunk/d33840c542b387ab08ba49aa6c45aa9567fd9be7 -> trunk/d33840c542b387ab08ba49aa6c45aa9567fd9be7 2025-09-07T07:51:36.8466176Z * [new tag] trunk/d5643e8f3a648a99636bfa1f2a41d54bd3c0d0f1 -> trunk/d5643e8f3a648a99636bfa1f2a41d54bd3c0d0f1 2025-09-07T07:51:36.8467397Z * [new tag] trunk/d5b38410b5b6cf75c7a7389972777a6497926ee7 -> trunk/d5b38410b5b6cf75c7a7389972777a6497926ee7 2025-09-07T07:51:36.8468373Z * [new tag] trunk/d5e0f4202ba14632e4d14862ace096609e763462 -> trunk/d5e0f4202ba14632e4d14862ace096609e763462 2025-09-07T07:51:36.8469814Z * [new tag] trunk/d636c181f9140a7b59be10b36eae23039fc2bb72 -> trunk/d636c181f9140a7b59be10b36eae23039fc2bb72 2025-09-07T07:51:36.8471780Z * [new tag] trunk/d64718503728001a1e78168fd7f2d4ff23e57285 -> trunk/d64718503728001a1e78168fd7f2d4ff23e57285 2025-09-07T07:51:36.8472897Z * [new tag] trunk/d67c29ad22670320d676b02e394274af34e8e643 -> trunk/d67c29ad22670320d676b02e394274af34e8e643 2025-09-07T07:51:36.8474056Z * [new tag] trunk/d6b74568e2c98ce58ecc145b72ac66d4caf7ce95 -> trunk/d6b74568e2c98ce58ecc145b72ac66d4caf7ce95 2025-09-07T07:51:36.8475309Z * [new tag] trunk/d711f27845abd45007ccab6076649ebd896c2661 -> trunk/d711f27845abd45007ccab6076649ebd896c2661 2025-09-07T07:51:36.8476514Z * [new tag] trunk/d9d6dde0f42d4bcc8c97671ac50d5096c7e500ab -> trunk/d9d6dde0f42d4bcc8c97671ac50d5096c7e500ab 2025-09-07T07:51:36.8477710Z * [new tag] trunk/da4db4b33d1fdd046650cf19fdbac581a19bf2f9 -> trunk/da4db4b33d1fdd046650cf19fdbac581a19bf2f9 2025-09-07T07:51:36.8478720Z * [new tag] 
trunk/dac8a4b91c01c3bbc96f54e621b1ea4ffdbd29d1 -> trunk/dac8a4b91c01c3bbc96f54e621b1ea4ffdbd29d1 2025-09-07T07:51:36.8479934Z * [new tag] trunk/dbec08729fb9848bebed6048c63831b87170d061 -> trunk/dbec08729fb9848bebed6048c63831b87170d061 2025-09-07T07:51:36.8480955Z * [new tag] trunk/dcf385395d838f38c8dca25913578230dd43099a -> trunk/dcf385395d838f38c8dca25913578230dd43099a 2025-09-07T07:51:36.8482063Z * [new tag] trunk/dd2519abe83ec3c40d4797492434e41fe3b47e17 -> trunk/dd2519abe83ec3c40d4797492434e41fe3b47e17 2025-09-07T07:51:36.8483211Z * [new tag] trunk/dec72ea4b006dd0fbcaaaa106ad273d73807ab9d -> trunk/dec72ea4b006dd0fbcaaaa106ad273d73807ab9d 2025-09-07T07:51:36.8484359Z * [new tag] trunk/e0a62b266c021b910ce6dc02a6c9429210487717 -> trunk/e0a62b266c021b910ce6dc02a6c9429210487717 2025-09-07T07:51:36.8485824Z * [new tag] trunk/e19e02c84c9dcc408375e5cae3b0709c18b99228 -> trunk/e19e02c84c9dcc408375e5cae3b0709c18b99228 2025-09-07T07:51:36.8487151Z * [new tag] trunk/e304ea4e69d3a7deeb7e48c7450c214a4c953937 -> trunk/e304ea4e69d3a7deeb7e48c7450c214a4c953937 2025-09-07T07:51:36.8488284Z * [new tag] trunk/e3068cdb446adefb5a875616ba37a60235391439 -> trunk/e3068cdb446adefb5a875616ba37a60235391439 2025-09-07T07:51:36.8489393Z * [new tag] trunk/e381d4b0205d5f126c1de534f867ba776f7c3ee6 -> trunk/e381d4b0205d5f126c1de534f867ba776f7c3ee6 2025-09-07T07:51:36.8490695Z * [new tag] trunk/e4bd0ff4f8981b805df32ea5b3550621965ea4f2 -> trunk/e4bd0ff4f8981b805df32ea5b3550621965ea4f2 2025-09-07T07:51:36.8491695Z * [new tag] trunk/e532c9d4f1cdcbc1ea9628f55b9813e77847bdc7 -> trunk/e532c9d4f1cdcbc1ea9628f55b9813e77847bdc7 2025-09-07T07:51:36.8492807Z * [new tag] trunk/e92cd9415377403b6e90585e764639e2e0b5973b -> trunk/e92cd9415377403b6e90585e764639e2e0b5973b 2025-09-07T07:51:36.8494001Z * [new tag] trunk/e9481b6617b5576b099d8ca5798111592e9ad090 -> trunk/e9481b6617b5576b099d8ca5798111592e9ad090 2025-09-07T07:51:36.8495160Z * [new tag] trunk/ea1883dfd3e42defe37b11202b878bb76defa087 -> 
trunk/ea1883dfd3e42defe37b11202b878bb76defa087 2025-09-07T07:51:36.8496402Z * [new tag] trunk/eac3d6f04cfbbebe3d470dacd216da7d4b1f95a8 -> trunk/eac3d6f04cfbbebe3d470dacd216da7d4b1f95a8 2025-09-07T07:51:36.8497572Z * [new tag] trunk/eb18d32bda75189494d955aa001ade15f10333de -> trunk/eb18d32bda75189494d955aa001ade15f10333de 2025-09-07T07:51:36.8498820Z * [new tag] trunk/ef3be6726f7ff4b77c22db10cec5b686f9107ea9 -> trunk/ef3be6726f7ff4b77c22db10cec5b686f9107ea9 2025-09-07T07:51:36.8500119Z * [new tag] trunk/ef8aabd42422725026cb4dbf48aafa9efa226a04 -> trunk/ef8aabd42422725026cb4dbf48aafa9efa226a04 2025-09-07T07:51:36.8501406Z * [new tag] trunk/f00445b43eee57e20bb9316fa796ca23bf73373b -> trunk/f00445b43eee57e20bb9316fa796ca23bf73373b 2025-09-07T07:51:36.8502938Z * [new tag] trunk/f0c391102b754e3b145e8c59231d2df563487e37 -> trunk/f0c391102b754e3b145e8c59231d2df563487e37 2025-09-07T07:51:36.8503989Z * [new tag] trunk/f27985b7e796fb66a1b476284ba42d8cb360a751 -> trunk/f27985b7e796fb66a1b476284ba42d8cb360a751 2025-09-07T07:51:36.8505344Z * [new tag] trunk/f36f285953700f971552083a5da9d0ceacb63bbd -> trunk/f36f285953700f971552083a5da9d0ceacb63bbd 2025-09-07T07:51:36.8506559Z * [new tag] trunk/f3cebec39ebc110e1c8b06e741896585f7892dbb -> trunk/f3cebec39ebc110e1c8b06e741896585f7892dbb 2025-09-07T07:51:36.8507579Z * [new tag] trunk/f4c33cd44acac92c0b451a04da20ebe9370e5b0c -> trunk/f4c33cd44acac92c0b451a04da20ebe9370e5b0c 2025-09-07T07:51:36.8508912Z * [new tag] trunk/f612045ce105f008b2b675e2fc870163babeb2e8 -> trunk/f612045ce105f008b2b675e2fc870163babeb2e8 2025-09-07T07:51:36.8510341Z * [new tag] trunk/f8746b878dfc1e9639d42cbde832e9b9e792c86c -> trunk/f8746b878dfc1e9639d42cbde832e9b9e792c86c 2025-09-07T07:51:36.8511491Z * [new tag] trunk/f8ffa9194e26523e5f976d4a824d5cc58922727c -> trunk/f8ffa9194e26523e5f976d4a824d5cc58922727c 2025-09-07T07:51:36.8512598Z * [new tag] trunk/f981a7fa5230b98974291fdde32fe8488bc5d469 -> trunk/f981a7fa5230b98974291fdde32fe8488bc5d469 
2025-09-07T07:51:36.8513820Z * [new tag] trunk/fbf3d2027daabbcb44d0af274b139be2a248a4f7 -> trunk/fbf3d2027daabbcb44d0af274b139be2a248a4f7 2025-09-07T07:51:36.8515256Z * [new tag] trunk/fca2601c9d628e1bd2d75c7318cd22c4e8c832aa -> trunk/fca2601c9d628e1bd2d75c7318cd22c4e8c832aa 2025-09-07T07:51:36.8516577Z * [new tag] trunk/fea20775ad96bdca972a1811d7d3372f368614ab -> trunk/fea20775ad96bdca972a1811d7d3372f368614ab 2025-09-07T07:51:36.8517953Z * [new tag] trunk/fefee081642f87419a21dc852f7167d4640443cd -> trunk/fefee081642f87419a21dc852f7167d4640443cd 2025-09-07T07:51:36.8519008Z * [new tag] v0.1.1 -> v0.1.1 2025-09-07T07:51:36.8520109Z * [new tag] v0.1.10 -> v0.1.10 2025-09-07T07:51:36.8521047Z * [new tag] v0.1.11 -> v0.1.11 2025-09-07T07:51:36.8522067Z * [new tag] v0.1.12 -> v0.1.12 2025-09-07T07:51:36.8522987Z * [new tag] v0.1.2 -> v0.1.2 2025-09-07T07:51:36.8523944Z * [new tag] v0.1.3 -> v0.1.3 2025-09-07T07:51:36.8525074Z * [new tag] v0.1.4 -> v0.1.4 2025-09-07T07:51:36.8526608Z * [new tag] v0.1.5 -> v0.1.5 2025-09-07T07:51:36.8527627Z * [new tag] v0.1.6 -> v0.1.6 2025-09-07T07:51:36.8528616Z * [new tag] v0.1.7 -> v0.1.7 2025-09-07T07:51:36.8529647Z * [new tag] v0.1.8 -> v0.1.8 2025-09-07T07:51:36.8530621Z * [new tag] v0.1.9 -> v0.1.9 2025-09-07T07:51:36.8531666Z * [new tag] v0.2.0 -> v0.2.0 2025-09-07T07:51:36.8532723Z * [new tag] v0.3.0 -> v0.3.0 2025-09-07T07:51:36.8533816Z * [new tag] v0.3.1 -> v0.3.1 2025-09-07T07:51:36.8534889Z * [new tag] v0.4.0 -> v0.4.0 2025-09-07T07:51:36.8536199Z * [new tag] v0.4.1 -> v0.4.1 2025-09-07T07:51:36.8537182Z * [new tag] v1.0.0 -> v1.0.0 2025-09-07T07:51:36.8538319Z * [new tag] v1.0.0a0 -> v1.0.0a0 2025-09-07T07:51:36.8539366Z * [new tag] v1.0.1 -> v1.0.1 2025-09-07T07:51:36.8540428Z * [new tag] v1.0rc0 -> v1.0rc0 2025-09-07T07:51:36.8541354Z * [new tag] v1.0rc1 -> v1.0rc1 2025-09-07T07:51:36.8542866Z * [new tag] v1.1.0 -> v1.1.0 2025-09-07T07:51:36.8543773Z * [new tag] v1.1.0a0 -> v1.1.0a0 2025-09-07T07:51:36.8545115Z * [new 
tag] v1.10.0 -> v1.10.0 2025-09-07T07:51:36.8546469Z * [new tag] v1.10.0-rc1 -> v1.10.0-rc1 2025-09-07T07:51:36.8547578Z * [new tag] v1.10.0-rc2 -> v1.10.0-rc2 2025-09-07T07:51:36.8548450Z * [new tag] v1.10.0-rc3 -> v1.10.0-rc3 2025-09-07T07:51:36.8549588Z * [new tag] v1.10.1 -> v1.10.1 2025-09-07T07:51:36.8550468Z * [new tag] v1.10.1-rc1 -> v1.10.1-rc1 2025-09-07T07:51:36.8551256Z * [new tag] v1.10.2 -> v1.10.2 2025-09-07T07:51:36.8552167Z * [new tag] v1.10.2-rc1 -> v1.10.2-rc1 2025-09-07T07:51:36.8553271Z * [new tag] v1.11.0 -> v1.11.0 2025-09-07T07:51:36.8554348Z * [new tag] v1.11.0-rc1 -> v1.11.0-rc1 2025-09-07T07:51:36.8555719Z * [new tag] v1.11.0-rc2 -> v1.11.0-rc2 2025-09-07T07:51:36.8556987Z * [new tag] v1.11.0-rc3 -> v1.11.0-rc3 2025-09-07T07:51:36.8558230Z * [new tag] v1.11.0-rc4 -> v1.11.0-rc4 2025-09-07T07:51:36.8559581Z * [new tag] v1.11.0-rc5 -> v1.11.0-rc5 2025-09-07T07:51:36.8560447Z * [new tag] v1.11.0-rc6 -> v1.11.0-rc6 2025-09-07T07:51:36.8561340Z * [new tag] v1.11.0-rc7 -> v1.11.0-rc7 2025-09-07T07:51:36.8562390Z * [new tag] v1.12.0 -> v1.12.0 2025-09-07T07:51:36.8563530Z * [new tag] v1.12.0-rc1 -> v1.12.0-rc1 2025-09-07T07:51:36.8564630Z * [new tag] v1.12.0-rc2 -> v1.12.0-rc2 2025-09-07T07:51:36.8565984Z * [new tag] v1.12.0-rc3 -> v1.12.0-rc3 2025-09-07T07:51:36.8567071Z * [new tag] v1.12.0-rc4 -> v1.12.0-rc4 2025-09-07T07:51:36.8568320Z * [new tag] v1.12.0-rc5 -> v1.12.0-rc5 2025-09-07T07:51:36.8569658Z * [new tag] v1.12.0-rc6 -> v1.12.0-rc6 2025-09-07T07:51:36.8570645Z * [new tag] v1.12.0-rc7 -> v1.12.0-rc7 2025-09-07T07:51:36.8571533Z * [new tag] v1.12.0-rc8 -> v1.12.0-rc8 2025-09-07T07:51:36.8572530Z * [new tag] v1.12.1 -> v1.12.1 2025-09-07T07:51:36.8573738Z * [new tag] v1.12.1-rc1 -> v1.12.1-rc1 2025-09-07T07:51:36.8574891Z * [new tag] v1.12.1-rc2 -> v1.12.1-rc2 2025-09-07T07:51:36.8576280Z * [new tag] v1.12.1-rc3 -> v1.12.1-rc3 2025-09-07T07:51:36.8577381Z * [new tag] v1.12.1-rc4 -> v1.12.1-rc4 2025-09-07T07:51:36.8578282Z * [new tag] 
v1.12.1-rc5 -> v1.12.1-rc5 2025-09-07T07:51:36.8579442Z * [new tag] v1.13.0 -> v1.13.0 2025-09-07T07:51:36.8580499Z * [new tag] v1.13.0-rc1 -> v1.13.0-rc1 2025-09-07T07:51:36.8581638Z * [new tag] v1.13.0-rc2 -> v1.13.0-rc2 2025-09-07T07:51:36.8582872Z * [new tag] v1.13.0-rc3 -> v1.13.0-rc3 2025-09-07T07:51:36.8584084Z * [new tag] v1.13.0-rc4 -> v1.13.0-rc4 2025-09-07T07:51:36.8585072Z * [new tag] v1.13.0-rc5 -> v1.13.0-rc5 2025-09-07T07:51:36.8586099Z * [new tag] v1.13.0-rc6 -> v1.13.0-rc6 2025-09-07T07:51:36.8587168Z * [new tag] v1.13.1 -> v1.13.1 2025-09-07T07:51:36.8588376Z * [new tag] v1.13.1-rc1 -> v1.13.1-rc1 2025-09-07T07:51:36.8589354Z * [new tag] v1.2.0 -> v1.2.0 2025-09-07T07:51:36.8590434Z * [new tag] v1.2.0a0 -> v1.2.0a0 2025-09-07T07:51:36.8591508Z * [new tag] v1.3.0 -> v1.3.0 2025-09-07T07:51:36.8592679Z * [new tag] v1.3.0a0 -> v1.3.0a0 2025-09-07T07:51:36.8593569Z * [new tag] v1.3.1 -> v1.3.1 2025-09-07T07:51:36.8594672Z * [new tag] v1.4.0 -> v1.4.0 2025-09-07T07:51:36.8596042Z * [new tag] v1.4.0a0 -> v1.4.0a0 2025-09-07T07:51:36.8596992Z * [new tag] v1.4.1 -> v1.4.1 2025-09-07T07:51:36.8598236Z * [new tag] v1.5.0 -> v1.5.0 2025-09-07T07:51:36.8599620Z * [new tag] v1.5.0-rc1 -> v1.5.0-rc1 2025-09-07T07:51:36.8600796Z * [new tag] v1.5.0-rc2 -> v1.5.0-rc2 2025-09-07T07:51:36.8601987Z * [new tag] v1.5.0-rc3 -> v1.5.0-rc3 2025-09-07T07:51:36.8603066Z * [new tag] v1.5.0-rc4 -> v1.5.0-rc4 2025-09-07T07:51:36.8603958Z * [new tag] v1.5.0-rc5 -> v1.5.0-rc5 2025-09-07T07:51:36.8605317Z * [new tag] v1.5.1 -> v1.5.1 2025-09-07T07:51:36.8606461Z * [new tag] v1.5.1-rc1 -> v1.5.1-rc1 2025-09-07T07:51:36.8607343Z * [new tag] v1.6.0 -> v1.6.0 2025-09-07T07:51:36.8608516Z * [new tag] v1.6.0-rc1 -> v1.6.0-rc1 2025-09-07T07:51:36.8609704Z * [new tag] v1.6.0-rc2 -> v1.6.0-rc2 2025-09-07T07:51:36.8610922Z * [new tag] v1.6.0-rc3 -> v1.6.0-rc3 2025-09-07T07:51:36.8612059Z * [new tag] v1.6.0-rc4 -> v1.6.0-rc4 2025-09-07T07:51:36.8613236Z * [new tag] v1.6.0-rc5 -> v1.6.0-rc5 
2025-09-07T07:51:36.8614328Z * [new tag] v1.6.0-rc6 -> v1.6.0-rc6 2025-09-07T07:51:36.8615434Z * [new tag] v1.6.0-rc7 -> v1.6.0-rc7 2025-09-07T07:51:36.8616622Z * [new tag] v1.7.0 -> v1.7.0 2025-09-07T07:51:36.8617841Z * [new tag] v1.7.0-rc1 -> v1.7.0-rc1 2025-09-07T07:51:36.8619083Z * [new tag] v1.7.0-rc2 -> v1.7.0-rc2 2025-09-07T07:51:36.8620231Z * [new tag] v1.7.0-rc3 -> v1.7.0-rc3 2025-09-07T07:51:36.8621142Z * [new tag] v1.7.0-rc4 -> v1.7.0-rc4 2025-09-07T07:51:36.8622450Z * [new tag] v1.7.1 -> v1.7.1 2025-09-07T07:51:36.8623788Z * [new tag] v1.7.1-rc1 -> v1.7.1-rc1 2025-09-07T07:51:36.8624933Z * [new tag] v1.7.1-rc2 -> v1.7.1-rc2 2025-09-07T07:51:36.8626890Z * [new tag] v1.7.1-rc3 -> v1.7.1-rc3 2025-09-07T07:51:36.8628051Z * [new tag] v1.8.0 -> v1.8.0 2025-09-07T07:51:36.8629080Z * [new tag] v1.8.0-rc1 -> v1.8.0-rc1 2025-09-07T07:51:36.8630525Z * [new tag] v1.8.0-rc2 -> v1.8.0-rc2 2025-09-07T07:51:36.8631695Z * [new tag] v1.8.0-rc3 -> v1.8.0-rc3 2025-09-07T07:51:36.8632813Z * [new tag] v1.8.0-rc4 -> v1.8.0-rc4 2025-09-07T07:51:36.8633756Z * [new tag] v1.8.0-rc5 -> v1.8.0-rc5 2025-09-07T07:51:36.8634805Z * [new tag] v1.8.1 -> v1.8.1 2025-09-07T07:51:36.8636429Z * [new tag] v1.8.1-rc1 -> v1.8.1-rc1 2025-09-07T07:51:36.8637242Z * [new tag] v1.8.1-rc2 -> v1.8.1-rc2 2025-09-07T07:51:36.8638389Z * [new tag] v1.8.1-rc3 -> v1.8.1-rc3 2025-09-07T07:51:36.8640105Z * [new tag] v1.8.2 -> v1.8.2 2025-09-07T07:51:36.8641032Z * [new tag] v1.8.2-rc1 -> v1.8.2-rc1 2025-09-07T07:51:36.8642235Z * [new tag] v1.9.0 -> v1.9.0 2025-09-07T07:51:36.8643368Z * [new tag] v1.9.0-rc1 -> v1.9.0-rc1 2025-09-07T07:51:36.8644645Z * [new tag] v1.9.0-rc2 -> v1.9.0-rc2 2025-09-07T07:51:36.8646093Z * [new tag] v1.9.0-rc3 -> v1.9.0-rc3 2025-09-07T07:51:36.8647044Z * [new tag] v1.9.0-rc4 -> v1.9.0-rc4 2025-09-07T07:51:36.8648285Z * [new tag] v1.9.1 -> v1.9.1 2025-09-07T07:51:36.8649664Z * [new tag] v1.9.1-rc1 -> v1.9.1-rc1 2025-09-07T07:51:36.8650630Z * [new tag] v1.9.1-rc2 -> v1.9.1-rc2 
2025-09-07T07:51:36.8651784Z * [new tag] v2.0.0 -> v2.0.0 2025-09-07T07:51:36.8653040Z * [new tag] v2.0.0-rc1 -> v2.0.0-rc1 2025-09-07T07:51:36.8654306Z * [new tag] v2.0.0-rc2 -> v2.0.0-rc2 2025-09-07T07:51:36.8655680Z * [new tag] v2.0.0-rc3 -> v2.0.0-rc3 2025-09-07T07:51:36.8656970Z * [new tag] v2.0.0-rc4 -> v2.0.0-rc4 2025-09-07T07:51:36.8658206Z * [new tag] v2.0.0-rc5 -> v2.0.0-rc5 2025-09-07T07:51:36.8659374Z * [new tag] v2.0.0-rc6 -> v2.0.0-rc6 2025-09-07T07:51:36.8660648Z * [new tag] v2.0.1 -> v2.0.1 2025-09-07T07:51:36.8661993Z * [new tag] v2.0.1-rc1 -> v2.0.1-rc1 2025-09-07T07:51:36.8662968Z * [new tag] v2.0.1-rc2 -> v2.0.1-rc2 2025-09-07T07:51:36.8664131Z * [new tag] v2.0.1-rc3 -> v2.0.1-rc3 2025-09-07T07:51:36.8665252Z * [new tag] v2.0.1-rc4 -> v2.0.1-rc4 2025-09-07T07:51:36.8666939Z * [new tag] v2.1.0 -> v2.1.0 2025-09-07T07:51:36.8668177Z * [new tag] v2.1.0-rc1 -> v2.1.0-rc1 2025-09-07T07:51:36.8669416Z * [new tag] v2.1.0-rc2 -> v2.1.0-rc2 2025-09-07T07:51:36.8670661Z * [new tag] v2.1.0-rc3 -> v2.1.0-rc3 2025-09-07T07:51:36.8671901Z * [new tag] v2.1.0-rc4 -> v2.1.0-rc4 2025-09-07T07:51:36.8673096Z * [new tag] v2.1.0-rc5 -> v2.1.0-rc5 2025-09-07T07:51:36.8674153Z * [new tag] v2.1.0-rc6 -> v2.1.0-rc6 2025-09-07T07:51:36.8675568Z * [new tag] v2.1.1 -> v2.1.1 2025-09-07T07:51:36.8676825Z * [new tag] v2.1.1-rc1 -> v2.1.1-rc1 2025-09-07T07:51:36.8678022Z * [new tag] v2.1.1-rc2 -> v2.1.1-rc2 2025-09-07T07:51:36.8679572Z * [new tag] v2.1.1-rc3 -> v2.1.1-rc3 2025-09-07T07:51:36.8680775Z * [new tag] v2.1.1-rc4 -> v2.1.1-rc4 2025-09-07T07:51:36.8681990Z * [new tag] v2.1.1-rc5 -> v2.1.1-rc5 2025-09-07T07:51:36.8682971Z * [new tag] v2.1.1-rc6 -> v2.1.1-rc6 2025-09-07T07:51:36.8684229Z * [new tag] v2.1.2 -> v2.1.2 2025-09-07T07:51:36.8685665Z * [new tag] v2.1.2-rc1 -> v2.1.2-rc1 2025-09-07T07:51:36.8686826Z * [new tag] v2.1.2-rc2 -> v2.1.2-rc2 2025-09-07T07:51:36.8687983Z * [new tag] v2.1.2-rc3 -> v2.1.2-rc3 2025-09-07T07:51:36.8689121Z * [new tag] v2.2.0 -> v2.2.0 
2025-09-07T07:51:36.8690405Z * [new tag] v2.2.0-rc1 -> v2.2.0-rc1 2025-09-07T07:51:36.8691559Z * [new tag] v2.2.0-rc2 -> v2.2.0-rc2 2025-09-07T07:51:36.8692683Z * [new tag] v2.2.0-rc3 -> v2.2.0-rc3 2025-09-07T07:51:36.8693853Z * [new tag] v2.2.0-rc4 -> v2.2.0-rc4 2025-09-07T07:51:36.8695196Z * [new tag] v2.2.0-rc5 -> v2.2.0-rc5 2025-09-07T07:51:36.8696550Z * [new tag] v2.2.0-rc6 -> v2.2.0-rc6 2025-09-07T07:51:36.8697519Z * [new tag] v2.2.0-rc7 -> v2.2.0-rc7 2025-09-07T07:51:36.8698523Z * [new tag] v2.2.0-rc8 -> v2.2.0-rc8 2025-09-07T07:51:36.8699748Z * [new tag] v2.2.1 -> v2.2.1 2025-09-07T07:51:36.8701094Z * [new tag] v2.2.1-rc1 -> v2.2.1-rc1 2025-09-07T07:51:36.8702229Z * [new tag] v2.2.1-rc2 -> v2.2.1-rc2 2025-09-07T07:51:36.8703325Z * [new tag] v2.2.1-rc3 -> v2.2.1-rc3 2025-09-07T07:51:36.8704301Z * [new tag] v2.2.2 -> v2.2.2 2025-09-07T07:51:36.8705874Z * [new tag] v2.2.2-rc1 -> v2.2.2-rc1 2025-09-07T07:51:36.8706908Z * [new tag] v2.2.2-rc2 -> v2.2.2-rc2 2025-09-07T07:51:36.8707929Z * [new tag] v2.2.2-rc3 -> v2.2.2-rc3 2025-09-07T07:51:36.8709264Z * [new tag] v2.3.0 -> v2.3.0 2025-09-07T07:51:36.8710738Z * [new tag] v2.3.0-rc1 -> v2.3.0-rc1 2025-09-07T07:51:36.8711988Z * [new tag] v2.3.0-rc10 -> v2.3.0-rc10 2025-09-07T07:51:36.8713362Z * [new tag] v2.3.0-rc11 -> v2.3.0-rc11 2025-09-07T07:51:36.8714357Z * [new tag] v2.3.0-rc12 -> v2.3.0-rc12 2025-09-07T07:51:36.8715815Z * [new tag] v2.3.0-rc2 -> v2.3.0-rc2 2025-09-07T07:51:36.8716978Z * [new tag] v2.3.0-rc3 -> v2.3.0-rc3 2025-09-07T07:51:36.8718269Z * [new tag] v2.3.0-rc4 -> v2.3.0-rc4 2025-09-07T07:51:36.8719476Z * [new tag] v2.3.0-rc5 -> v2.3.0-rc5 2025-09-07T07:51:36.8720509Z * [new tag] v2.3.0-rc6 -> v2.3.0-rc6 2025-09-07T07:51:36.8721767Z * [new tag] v2.3.0-rc7 -> v2.3.0-rc7 2025-09-07T07:51:36.8723011Z * [new tag] v2.3.0-rc8 -> v2.3.0-rc8 2025-09-07T07:51:36.8724018Z * [new tag] v2.3.0-rc9 -> v2.3.0-rc9 2025-09-07T07:51:36.8725143Z * [new tag] v2.3.1 -> v2.3.1 2025-09-07T07:51:36.8726556Z * [new tag] 
v2.3.1-rc1 -> v2.3.1-rc1 2025-09-07T07:51:36.8727795Z * [new tag] v2.3.1-rc2 -> v2.3.1-rc2 2025-09-07T07:51:36.8729870Z * [new tag] v2.3.1-rc3 -> v2.3.1-rc3 2025-09-07T07:51:36.8731088Z * [new tag] v2.4.0 -> v2.4.0 2025-09-07T07:51:36.8732194Z * [new tag] v2.4.0-rc1 -> v2.4.0-rc1 2025-09-07T07:51:36.8733413Z * [new tag] v2.4.0-rc2 -> v2.4.0-rc2 2025-09-07T07:51:36.8734654Z * [new tag] v2.4.0-rc3 -> v2.4.0-rc3 2025-09-07T07:51:36.8736097Z * [new tag] v2.4.0-rc4 -> v2.4.0-rc4 2025-09-07T07:51:36.8737317Z * [new tag] v2.4.0-rc5 -> v2.4.0-rc5 2025-09-07T07:51:36.8738883Z * [new tag] v2.4.0-rc6 -> v2.4.0-rc6 2025-09-07T07:51:36.8740060Z * [new tag] v2.4.0-rc7 -> v2.4.0-rc7 2025-09-07T07:51:36.8741223Z * [new tag] v2.4.0-rc8 -> v2.4.0-rc8 2025-09-07T07:51:36.8742691Z * [new tag] v2.4.0-rc9 -> v2.4.0-rc9 2025-09-07T07:51:36.8743776Z * [new tag] v2.4.1 -> v2.4.1 2025-09-07T07:51:36.8745143Z * [new tag] v2.4.1-rc1 -> v2.4.1-rc1 2025-09-07T07:51:36.8746452Z * [new tag] v2.4.1-rc2 -> v2.4.1-rc2 2025-09-07T07:51:36.8747697Z * [new tag] v2.4.1-rc3 -> v2.4.1-rc3 2025-09-07T07:51:36.8748897Z * [new tag] v2.5.0 -> v2.5.0 2025-09-07T07:51:36.8750035Z * [new tag] v2.5.0-rc1 -> v2.5.0-rc1 2025-09-07T07:51:36.8751031Z * [new tag] v2.5.0-rc10 -> v2.5.0-rc10 2025-09-07T07:51:36.8752263Z * [new tag] v2.5.0-rc2 -> v2.5.0-rc2 2025-09-07T07:51:36.8753475Z * [new tag] v2.5.0-rc3 -> v2.5.0-rc3 2025-09-07T07:51:36.8754581Z * [new tag] v2.5.0-rc4 -> v2.5.0-rc4 2025-09-07T07:51:36.8756139Z * [new tag] v2.5.0-rc5 -> v2.5.0-rc5 2025-09-07T07:51:36.8757342Z * [new tag] v2.5.0-rc6 -> v2.5.0-rc6 2025-09-07T07:51:36.8758744Z * [new tag] v2.5.0-rc7 -> v2.5.0-rc7 2025-09-07T07:51:36.8760050Z * [new tag] v2.5.0-rc8 -> v2.5.0-rc8 2025-09-07T07:51:36.8761214Z * [new tag] v2.5.0-rc9 -> v2.5.0-rc9 2025-09-07T07:51:36.8762224Z * [new tag] v2.5.1 -> v2.5.1 2025-09-07T07:51:36.8763227Z * [new tag] v2.5.1-rc1 -> v2.5.1-rc1 2025-09-07T07:51:36.8764184Z * [new tag] v2.6.0 -> v2.6.0 2025-09-07T07:51:36.8765884Z * 
[new tag] v2.6.0-rc1 -> v2.6.0-rc1 2025-09-07T07:51:36.8767251Z * [new tag] v2.6.0-rc2 -> v2.6.0-rc2 2025-09-07T07:51:36.8768579Z * [new tag] v2.6.0-rc3 -> v2.6.0-rc3 2025-09-07T07:51:36.8769726Z * [new tag] v2.6.0-rc4 -> v2.6.0-rc4 2025-09-07T07:51:36.8771056Z * [new tag] v2.6.0-rc5 -> v2.6.0-rc5 2025-09-07T07:51:36.8772466Z * [new tag] v2.6.0-rc6 -> v2.6.0-rc6 2025-09-07T07:51:36.8773695Z * [new tag] v2.6.0-rc7 -> v2.6.0-rc7 2025-09-07T07:51:36.8775097Z * [new tag] v2.6.0-rc8 -> v2.6.0-rc8 2025-09-07T07:51:36.8776443Z * [new tag] v2.6.0-rc9 -> v2.6.0-rc9 2025-09-07T07:51:36.8777748Z * [new tag] v2.7.0 -> v2.7.0 2025-09-07T07:51:36.8779017Z * [new tag] v2.7.0-rc1 -> v2.7.0-rc1 2025-09-07T07:51:36.8780033Z * [new tag] v2.7.0-rc10 -> v2.7.0-rc10 2025-09-07T07:51:36.8781315Z * [new tag] v2.7.0-rc2 -> v2.7.0-rc2 2025-09-07T07:51:36.8782716Z * [new tag] v2.7.0-rc3 -> v2.7.0-rc3 2025-09-07T07:51:36.8784014Z * [new tag] v2.7.0-rc4 -> v2.7.0-rc4 2025-09-07T07:51:36.8785232Z * [new tag] v2.7.0-rc5 -> v2.7.0-rc5 2025-09-07T07:51:36.8786518Z * [new tag] v2.7.0-rc6 -> v2.7.0-rc6 2025-09-07T07:51:36.8787806Z * [new tag] v2.7.0-rc7 -> v2.7.0-rc7 2025-09-07T07:51:36.8789071Z * [new tag] v2.7.0-rc8 -> v2.7.0-rc8 2025-09-07T07:51:36.8790583Z * [new tag] v2.7.0-rc9 -> v2.7.0-rc9 2025-09-07T07:51:36.8791493Z * [new tag] v2.7.1 -> v2.7.1 2025-09-07T07:51:36.8792743Z * [new tag] v2.7.1-rc1 -> v2.7.1-rc1 2025-09-07T07:51:36.8794017Z * [new tag] v2.7.1-rc2 -> v2.7.1-rc2 2025-09-07T07:51:36.8795467Z * [new tag] v2.7.1-rc3 -> v2.7.1-rc3 2025-09-07T07:51:36.8796779Z * [new tag] v2.7.1-rc4 -> v2.7.1-rc4 2025-09-07T07:51:36.8797886Z * [new tag] v2.7.1-rc5 -> v2.7.1-rc5 2025-09-07T07:51:36.8799118Z * [new tag] v2.8.0 -> v2.8.0 2025-09-07T07:51:36.8800383Z * [new tag] v2.8.0-rc1 -> v2.8.0-rc1 2025-09-07T07:51:36.8801586Z * [new tag] v2.8.0-rc2 -> v2.8.0-rc2 2025-09-07T07:51:36.8802887Z * [new tag] v2.8.0-rc3 -> v2.8.0-rc3 2025-09-07T07:51:36.8804183Z * [new tag] v2.8.0-rc4 -> v2.8.0-rc4 
2025-09-07T07:51:36.8805781Z * [new tag] v2.8.0-rc5 -> v2.8.0-rc5 2025-09-07T07:51:36.8807160Z * [new tag] v2.8.0-rc6 -> v2.8.0-rc6 2025-09-07T07:51:36.8808346Z * [new tag] v2.8.0-rc7 -> v2.8.0-rc7 2025-09-07T07:51:36.8809742Z * [new tag] v2.8.0-rc8 -> v2.8.0-rc8 2025-09-07T07:51:36.8810878Z * [new tag] whc_flight_1 -> whc_flight_1 2025-09-07T07:51:36.8812141Z * [new tag] whc_flight_2 -> whc_flight_2 2025-09-07T07:51:36.8813275Z * [new tag] whc_flight_4 -> whc_flight_4 2025-09-07T07:51:36.9650454Z [command]/usr/bin/git rev-parse --verify --quiet 93fb23d6fae7c4e82c4239a1033e522088742634^{object} 2025-09-07T07:51:36.9680990Z 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:51:36.9686051Z ##[endgroup] 2025-09-07T07:51:36.9686334Z ##[group]Determining the checkout info 2025-09-07T07:51:36.9686801Z ##[endgroup] 2025-09-07T07:51:36.9690401Z [command]/usr/bin/git sparse-checkout disable 2025-09-07T07:51:36.9740211Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-09-07T07:51:36.9774294Z ##[group]Checking out the ref 2025-09-07T07:51:36.9777586Z [command]/usr/bin/git checkout --progress --force 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:51:38.0133524Z Updating files: 75% (14682/19405) 2025-09-07T07:51:38.0265422Z Updating files: 76% (14748/19405) 2025-09-07T07:51:38.0367118Z Updating files: 77% (14942/19405) 2025-09-07T07:51:38.0582601Z Updating files: 78% (15136/19405) 2025-09-07T07:51:38.0791659Z Updating files: 79% (15330/19405) 2025-09-07T07:51:38.1038427Z Updating files: 80% (15524/19405) 2025-09-07T07:51:38.1262796Z Updating files: 81% (15719/19405) 2025-09-07T07:51:38.1471294Z Updating files: 82% (15913/19405) 2025-09-07T07:51:38.1580604Z Updating files: 83% (16107/19405) 2025-09-07T07:51:38.1705871Z Updating files: 84% (16301/19405) 2025-09-07T07:51:38.1852081Z Updating files: 85% (16495/19405) 2025-09-07T07:51:38.1979438Z Updating files: 86% (16689/19405) 2025-09-07T07:51:38.2105351Z Updating files: 87% 
(16883/19405) 2025-09-07T07:51:38.2202390Z Updating files: 88% (17077/19405) 2025-09-07T07:51:38.2334222Z Updating files: 89% (17271/19405) 2025-09-07T07:51:38.2493211Z Updating files: 90% (17465/19405) 2025-09-07T07:51:38.2600174Z Updating files: 91% (17659/19405) 2025-09-07T07:51:38.2734039Z Updating files: 92% (17853/19405) 2025-09-07T07:51:38.2904756Z Updating files: 93% (18047/19405) 2025-09-07T07:51:38.3092458Z Updating files: 94% (18241/19405) 2025-09-07T07:51:38.3237515Z Updating files: 95% (18435/19405) 2025-09-07T07:51:38.3383890Z Updating files: 96% (18629/19405) 2025-09-07T07:51:38.3548165Z Updating files: 97% (18823/19405) 2025-09-07T07:51:38.3787993Z Updating files: 98% (19017/19405) 2025-09-07T07:51:38.3929670Z Updating files: 99% (19211/19405) 2025-09-07T07:51:38.3929959Z Updating files: 100% (19405/19405) 2025-09-07T07:51:38.3930232Z Updating files: 100% (19405/19405), done. 2025-09-07T07:51:38.5023144Z Note: switching to '93fb23d6fae7c4e82c4239a1033e522088742634'. 2025-09-07T07:51:38.5023415Z 2025-09-07T07:51:38.5023613Z You are in 'detached HEAD' state. You can look around, make experimental 2025-09-07T07:51:38.5024114Z changes and commit them, and you can discard any commits you make in this 2025-09-07T07:51:38.5024599Z state without impacting any branches by switching back to a branch. 2025-09-07T07:51:38.5024881Z 2025-09-07T07:51:38.5025432Z If you want to create a new branch to retain commits you create, you may 2025-09-07T07:51:38.5025886Z do so (now or later) by using -c with the switch command. 
Example: 2025-09-07T07:51:38.5026139Z 2025-09-07T07:51:38.5026246Z git switch -c <new-branch-name> 2025-09-07T07:51:38.5026439Z 2025-09-07T07:51:38.5026557Z Or undo this operation with: 2025-09-07T07:51:38.5026740Z 2025-09-07T07:51:38.5026827Z git switch - 2025-09-07T07:51:38.5026953Z 2025-09-07T07:51:38.5027190Z Turn off this advice by setting config variable advice.detachedHead to false 2025-09-07T07:51:38.5027500Z 2025-09-07T07:51:38.5027676Z HEAD is now at 93fb23d6fae Build vLLM nightly wheels (#162000) 2025-09-07T07:51:38.5162345Z ##[endgroup] 2025-09-07T07:51:38.5162747Z ##[group]Setting up auth for fetching submodules 2025-09-07T07:51:38.5169317Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-09-07T07:51:38.5211863Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-09-07T07:51:38.5246646Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-09-07T07:51:38.5278394Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-09-07T07:51:38.5306463Z ##[endgroup] 2025-09-07T07:51:38.5306840Z ##[group]Fetching submodules 2025-09-07T07:51:38.5309571Z [command]/usr/bin/git submodule sync --recursive 2025-09-07T07:51:38.5592495Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-09-07T07:51:38.5863860Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2025-09-07T07:51:38.5873800Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2025-09-07T07:51:38.5885164Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2025-09-07T07:51:38.5896385Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path
'third_party/NNPACK' 2025-09-07T07:51:38.5907452Z Submodule 'third_party/NVTX' (https://github.com/NVIDIA/NVTX.git) registered for path 'third_party/NVTX' 2025-09-07T07:51:38.5918591Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2025-09-07T07:51:38.5929273Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2025-09-07T07:51:38.5940233Z Submodule 'third_party/aiter' (https://github.com/ROCm/aiter.git) registered for path 'third_party/aiter' 2025-09-07T07:51:38.5951046Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2025-09-07T07:51:38.5963255Z Submodule 'third_party/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/composable_kernel' 2025-09-07T07:51:38.5974411Z Submodule 'third_party/cpp-httplib' (https://github.com/yhirose/cpp-httplib.git) registered for path 'third_party/cpp-httplib' 2025-09-07T07:51:38.5986347Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2025-09-07T07:51:38.5997539Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2025-09-07T07:51:38.6008771Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2025-09-07T07:51:38.6019948Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2025-09-07T07:51:38.6031009Z Submodule 'third_party/flash-attention' (https://github.com/Dao-AILab/flash-attention.git) registered for path 'third_party/flash-attention' 2025-09-07T07:51:38.6042138Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 
'third_party/flatbuffers' 2025-09-07T07:51:38.6053578Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2025-09-07T07:51:38.6065156Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:51:38.6076825Z Submodule 'third_party/gloo' (https://github.com/pytorch/gloo) registered for path 'third_party/gloo' 2025-09-07T07:51:38.6088511Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2025-09-07T07:51:38.6100018Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2025-09-07T07:51:38.6111535Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2025-09-07T07:51:38.6123117Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2025-09-07T07:51:38.6134925Z Submodule 'third_party/kleidiai' (https://github.com/ARM-software/kleidiai.git) registered for path 'third_party/kleidiai' 2025-09-07T07:51:38.6146761Z Submodule 'third_party/mimalloc' (https://github.com/microsoft/mimalloc.git) registered for path 'third_party/mimalloc' 2025-09-07T07:51:38.6158254Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2025-09-07T07:51:38.6169650Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2025-09-07T07:51:38.6181687Z Submodule 'third_party/opentelemetry-cpp' (https://github.com/open-telemetry/opentelemetry-cpp.git) registered for path 'third_party/opentelemetry-cpp' 2025-09-07T07:51:38.6193244Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2025-09-07T07:51:38.6205278Z Submodule 'third_party/protobuf' 
(https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2025-09-07T07:51:38.6217311Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2025-09-07T07:51:38.6229322Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2025-09-07T07:51:38.6241250Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2025-09-07T07:51:38.6253181Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2025-09-07T07:51:38.6265165Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2025-09-07T07:51:38.6277213Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2025-09-07T07:51:38.6311816Z Cloning into '/home/david/_work/pytorch/pytorch/android/libs/fbjni'... 2025-09-07T07:51:39.0030988Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/FP16'... 2025-09-07T07:51:39.3105122Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/FXdiv'... 2025-09-07T07:51:39.5984648Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/NNPACK'... 2025-09-07T07:51:40.0088048Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/NVTX'... 2025-09-07T07:51:40.5143062Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2025-09-07T07:51:41.6478192Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/XNNPACK'... 2025-09-07T07:52:13.5690385Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/aiter'... 2025-09-07T07:52:20.9655510Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/benchmark'... 
2025-09-07T07:52:22.2134197Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/composable_kernel'... 2025-09-07T07:52:31.3624719Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/cpp-httplib'... 2025-09-07T07:52:34.8293107Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/cpuinfo'... 2025-09-07T07:52:38.2204603Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2025-09-07T07:52:43.3056630Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/cutlass'... 2025-09-07T07:52:49.0938327Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fbgemm'... 2025-09-07T07:52:51.4010785Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/flash-attention'... 2025-09-07T07:52:52.3599541Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/flatbuffers'... 2025-09-07T07:52:54.0355849Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fmt'... 2025-09-07T07:52:57.0908818Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2025-09-07T07:52:59.2453088Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/gloo'... 2025-09-07T07:53:02.0901869Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/googletest'... 2025-09-07T07:53:04.8890651Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/ideep'... 2025-09-07T07:53:05.6967694Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/ittapi'... 2025-09-07T07:53:06.2395343Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto'... 2025-09-07T07:53:07.3915965Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kleidiai'... 2025-09-07T07:53:07.9718691Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/mimalloc'... 2025-09-07T07:53:08.9156413Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/nlohmann'... 2025-09-07T07:53:14.6826022Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/onnx'... 
2025-09-07T07:53:17.2190355Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp'... 2025-09-07T07:53:22.1007407Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/pocketfft'... 2025-09-07T07:53:22.4318008Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/protobuf'... 2025-09-07T07:53:29.8563039Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/psimd'... 2025-09-07T07:53:30.1396292Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/pthreadpool'... 2025-09-07T07:53:30.5298800Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/pybind11'... 2025-09-07T07:53:31.4009970Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/python-peachpy'... 2025-09-07T07:53:31.8344813Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/sleef'... 2025-09-07T07:53:32.7569352Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/tensorpipe'... 2025-09-07T07:53:33.2680235Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-09-07T07:53:33.2812181Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-09-07T07:53:33.2916161Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-09-07T07:53:33.3170942Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-09-07T07:53:33.3931334Z Submodule path 'third_party/NVTX': checked out '2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07' 2025-09-07T07:53:33.4458753Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-09-07T07:53:34.1904205Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-09-07T07:53:34.3460587Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-09-07T07:53:34.3497083Z Submodule '3rdparty/composable_kernel' 
(https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:53:34.3529804Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/aiter/3rdparty/composable_kernel'... 2025-09-07T07:53:37.5721766Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-09-07T07:53:37.5970808Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-09-07T07:53:37.9302629Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-09-07T07:53:37.9800951Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-09-07T07:53:38.0768500Z Submodule path 'third_party/cpuinfo': checked out '5e3d2445e6a84d9599bee2bf78edbb4d80865e1d' 2025-09-07T07:53:38.1203133Z Submodule path 'third_party/cudnn_frontend': checked out 'f937055efc6d414d11f4c6577e3977fe74f35fb6' 2025-09-07T07:53:38.7560506Z Submodule path 'third_party/cutlass': checked out 'e51efbfe18fe4f4cbb66ab814c55bf4aa0185491' 2025-09-07T07:53:38.9010293Z Submodule path 'third_party/fbgemm': checked out '4b39c551efe15e6bbade20565b0ceb2d8ce3352d' 2025-09-07T07:53:38.9049434Z Submodule 'external/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/external/asmjit' 2025-09-07T07:53:38.9062011Z Submodule 'external/composable_kernel' (https://github.com/jwfromm/composable_kernel.git) registered for path 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:53:38.9074761Z Submodule 'external/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:53:38.9087894Z Submodule 'external/cutlass' (https://github.com/jwfromm/cutlass) registered for path 'third_party/fbgemm/external/cutlass' 2025-09-07T07:53:38.9100941Z Submodule 'external/googletest' 
(https://github.com/google/googletest) registered for path 'third_party/fbgemm/external/googletest' 2025-09-07T07:53:38.9113153Z Submodule 'external/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:53:38.9125090Z Submodule 'external/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/fbgemm/external/json' 2025-09-07T07:53:38.9158596Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fbgemm/external/asmjit'... 2025-09-07T07:53:40.7448228Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fbgemm/external/composable_kernel'... 2025-09-07T07:53:42.0178671Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fbgemm/external/cpuinfo'... 2025-09-07T07:53:42.7256274Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fbgemm/external/cutlass'... 2025-09-07T07:53:44.3919430Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fbgemm/external/googletest'... 2025-09-07T07:53:45.2864756Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fbgemm/external/hipify_torch'... 2025-09-07T07:53:45.6983557Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/fbgemm/external/json'... 
2025-09-07T07:53:51.1630154Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-09-07T07:53:51.4287269Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out 'b1281b8b08d973a7064f864f47eeb30f3e2596e9' 2025-09-07T07:53:51.5275388Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-09-07T07:53:52.1655770Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '311f3c8e51dc0eb56310cfc6980bf63d0fbd7917' 2025-09-07T07:53:52.2114145Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-09-07T07:53:52.2252712Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-09-07T07:53:52.3301560Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-09-07T07:53:52.4032822Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-09-07T07:53:52.4065129Z Submodule 'csrc/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:53:52.4076783Z Submodule 'csrc/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:53:52.4107709Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/flash-attention/csrc/composable_kernel'... 2025-09-07T07:53:55.4510022Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/flash-attention/csrc/cutlass'... 
2025-09-07T07:53:57.6197693Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-09-07T07:53:58.1789614Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-09-07T07:53:58.3188048Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-09-07T07:53:58.3512429Z Submodule path 'third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-09-07T07:53:58.3908127Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-09-07T07:53:58.4161947Z Submodule path 'third_party/gloo': checked out 'c7b7b022c124d9643957d9bd55f57ac59fce8fa2' 2025-09-07T07:53:58.4598953Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-09-07T07:53:58.4735378Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-09-07T07:53:58.4762370Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2025-09-07T07:53:58.4791262Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 
2025-09-07T07:54:10.6769668Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-09-07T07:54:10.7012599Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-09-07T07:54:10.7871074Z Submodule path 'third_party/kineto': checked out '5e7501833f1021ce6f618572d3baf657b6319658' 2025-09-07T07:54:10.7968306Z Submodule 'libkineto/third_party/dynolog' (https://github.com/facebookincubator/dynolog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:54:10.8098961Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:54:10.8276819Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:54:10.8310622Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog'... 2025-09-07T07:54:11.7047019Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2025-09-07T07:54:13.3897986Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 
2025-09-07T07:54:14.9652868Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e' 2025-09-07T07:54:15.0437698Z Submodule 'third_party/DCGM' (https://github.com/NVIDIA/DCGM.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:54:15.0808422Z Submodule 'third_party/cpr' (https://github.com/libcpr/cpr.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:54:15.1047240Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:54:15.1135625Z Submodule 'third_party/gflags' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:54:15.1368470Z Submodule 'third_party/glog' (https://github.com/google/glog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:54:15.1576398Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:54:15.1756443Z Submodule 'third_party/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:54:15.1937318Z Submodule 'third_party/pfs' (https://github.com/dtrugman/pfs.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:54:15.1973669Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'... 2025-09-07T07:54:16.4971932Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'... 
2025-09-07T07:54:16.9272738Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'... 2025-09-07T07:54:17.8980898Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'... 2025-09-07T07:54:18.3199032Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/glog'... 2025-09-07T07:54:18.8555924Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'... 2025-09-07T07:54:19.7106640Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json'... 2025-09-07T07:54:26.9677097Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'... 2025-09-07T07:54:27.8058997Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-09-07T07:54:27.8275997Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-09-07T07:54:27.8686938Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-09-07T07:54:27.8962787Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-09-07T07:54:27.9162940Z Submodule 'doc' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:54:27.9193823Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'... 
2025-09-07T07:54:28.9731777Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-09-07T07:54:28.9947114Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-09-07T07:54:29.0426695Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850' 2025-09-07T07:54:29.1459199Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-09-07T07:54:29.1644155Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-09-07T07:54:29.2130748Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164' 2025-09-07T07:54:29.2738284Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2025-09-07T07:54:29.3190826Z Submodule path 'third_party/kleidiai': checked out 'cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7' 2025-09-07T07:54:29.3663957Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-09-07T07:54:29.5052100Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-09-07T07:54:30.3402530Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-09-07T07:54:30.4235992Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2025-09-07T07:54:30.4712945Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 
2025-09-07T07:54:31.8972818Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-09-07T07:54:31.9871245Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-09-07T07:54:32.0166753Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark) registered for path 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:54:32.0308671Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:54:32.0534006Z Submodule 'third_party/ms-gsl' (https://github.com/microsoft/GSL) registered for path 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:54:32.0787201Z Submodule 'third_party/nlohmann-json' (https://github.com/nlohmann/json) registered for path 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:54:32.1165473Z Submodule 'third_party/opentelemetry-proto' (https://github.com/open-telemetry/opentelemetry-proto) registered for path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:54:32.1404407Z Submodule 'third_party/opentracing-cpp' (https://github.com/opentracing/opentracing-cpp.git) registered for path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:54:32.1527330Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:54:32.1694655Z Submodule 'tools/vcpkg' (https://github.com/Microsoft/vcpkg) registered for path 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:54:32.1724326Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/benchmark'... 2025-09-07T07:54:33.2186618Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/googletest'... 
2025-09-07T07:54:34.0879800Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl'... 2025-09-07T07:54:34.4850354Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/nlohmann-json'... 2025-09-07T07:54:41.3266806Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto'... 2025-09-07T07:54:41.6912952Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp'... 2025-09-07T07:54:42.0344856Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp'... 2025-09-07T07:54:42.4787686Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/tools/vcpkg'... 2025-09-07T07:54:48.6422229Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-09-07T07:54:48.6826739Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-09-07T07:54:48.6995970Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-09-07T07:54:48.8063589Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-09-07T07:54:48.8214460Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-09-07T07:54:48.8372529Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-09-07T07:54:48.8541495Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-09-07T07:54:48.8567420Z Submodule 'civetweb' 
(https://github.com/civetweb/civetweb.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:54:48.8578682Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:54:48.8609939Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'... 2025-09-07T07:54:50.3798932Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'... 2025-09-07T07:54:51.5431395Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-09-07T07:54:51.5894414Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-09-07T07:54:52.1382538Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-09-07T07:54:52.1520371Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-09-07T07:54:52.4277680Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-09-07T07:54:52.4312696Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:54:52.4326149Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2025-09-07T07:54:52.4360036Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2025-09-07T07:54:52.9512903Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 
2025-09-07T07:54:53.8776435Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-09-07T07:54:53.9454883Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-09-07T07:54:53.9564004Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-09-07T07:54:53.9695813Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-09-07T07:54:54.0086999Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-09-07T07:54:54.0405411Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-09-07T07:54:54.0838217Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-09-07T07:54:54.1117866Z Submodule path 'third_party/tensorpipe': checked out 'af0118d13e52f5a08841464a768e01a0bf3e3075' 2025-09-07T07:54:54.1149170Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:54:54.1161818Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:54:54.1174301Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:54:54.1187104Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:54:54.1216973Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2025-09-07T07:54:55.0339221Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 
2025-09-07T07:54:55.4107984Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2025-09-07T07:54:56.5262230Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2025-09-07T07:54:57.4892371Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-09-07T07:54:57.5055513Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-09-07T07:54:57.5762522Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-09-07T07:54:57.6052422Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-09-07T07:54:57.6078695Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:54:57.6109658Z Cloning into '/home/david/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 
2025-09-07T07:54:57.9481774Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-09-07T07:54:57.9528556Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-09-07T07:54:57.9798693Z Entering 'android/libs/fbjni' 2025-09-07T07:54:57.9843360Z Entering 'third_party/FP16' 2025-09-07T07:54:57.9886320Z Entering 'third_party/FXdiv' 2025-09-07T07:54:57.9929821Z Entering 'third_party/NNPACK' 2025-09-07T07:54:57.9973104Z Entering 'third_party/NVTX' 2025-09-07T07:54:58.0017174Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:54:58.0060102Z Entering 'third_party/XNNPACK' 2025-09-07T07:54:58.0117319Z Entering 'third_party/aiter' 2025-09-07T07:54:58.0160554Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:54:58.0212504Z Entering 'third_party/benchmark' 2025-09-07T07:54:58.0258225Z Entering 'third_party/composable_kernel' 2025-09-07T07:54:58.0308955Z Entering 'third_party/cpp-httplib' 2025-09-07T07:54:58.0351866Z Entering 'third_party/cpuinfo' 2025-09-07T07:54:58.0395264Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:54:58.0438479Z Entering 'third_party/cutlass' 2025-09-07T07:54:58.0490399Z Entering 'third_party/fbgemm' 2025-09-07T07:54:58.0534777Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:54:58.0576703Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:54:58.0624114Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:54:58.0665783Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:54:58.0716645Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:54:58.0757897Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:54:58.0798232Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:54:58.0843657Z Entering 'third_party/flash-attention' 2025-09-07T07:54:58.0887843Z Entering 'third_party/flash-attention/csrc/composable_kernel' 
2025-09-07T07:54:58.0934403Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:54:58.0985067Z Entering 'third_party/flatbuffers' 2025-09-07T07:54:58.1029840Z Entering 'third_party/fmt' 2025-09-07T07:54:58.1073166Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:54:58.1116398Z Entering 'third_party/gloo' 2025-09-07T07:54:58.1158955Z Entering 'third_party/googletest' 2025-09-07T07:54:58.1201848Z Entering 'third_party/ideep' 2025-09-07T07:54:58.1243473Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:54:58.1293399Z Entering 'third_party/ittapi' 2025-09-07T07:54:58.1338029Z Entering 'third_party/kineto' 2025-09-07T07:54:58.1379799Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:54:58.1420463Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:54:58.1463564Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:54:58.1505129Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:54:58.1546398Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:54:58.1586665Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:54:58.1631932Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:54:58.1673460Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:54:58.1714499Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:54:58.1756524Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:54:58.1800194Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:54:58.1840834Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:54:58.1885351Z Entering 'third_party/kleidiai' 2025-09-07T07:54:58.1930773Z Entering 'third_party/mimalloc' 
2025-09-07T07:54:58.1974583Z Entering 'third_party/nlohmann' 2025-09-07T07:54:58.2018493Z Entering 'third_party/onnx' 2025-09-07T07:54:58.2075219Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:54:58.2123175Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:54:58.2167103Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:54:58.2207960Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:54:58.2248427Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:54:58.2289109Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:54:58.2331918Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:54:58.2372457Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:54:58.2413486Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:54:58.2452926Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:54:58.2498103Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:54:58.2541998Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:54:58.2602494Z Entering 'third_party/pocketfft' 2025-09-07T07:54:58.2645407Z Entering 'third_party/protobuf' 2025-09-07T07:54:58.2690088Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:54:58.2730891Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:54:58.2775285Z Entering 'third_party/psimd' 2025-09-07T07:54:58.2817663Z Entering 'third_party/pthreadpool' 2025-09-07T07:54:58.2860197Z Entering 'third_party/pybind11' 2025-09-07T07:54:58.2902930Z Entering 'third_party/python-peachpy' 2025-09-07T07:54:58.2945570Z Entering 'third_party/sleef' 2025-09-07T07:54:58.2988415Z Entering 'third_party/tensorpipe' 2025-09-07T07:54:58.3030579Z Entering 'third_party/tensorpipe/third_party/googletest' 
2025-09-07T07:54:58.3071388Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:54:58.3112517Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:54:58.3153043Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:54:58.3192871Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:54:58.3252982Z ##[endgroup] 2025-09-07T07:54:58.3254476Z ##[group]Persisting credentials for submodules 2025-09-07T07:54:58.3259851Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-09-07T07:54:58.3522487Z Entering 'android/libs/fbjni' 2025-09-07T07:54:58.3574250Z Entering 'third_party/FP16' 2025-09-07T07:54:58.3622384Z Entering 'third_party/FXdiv' 2025-09-07T07:54:58.3670598Z Entering 'third_party/NNPACK' 2025-09-07T07:54:58.3719363Z Entering 'third_party/NVTX' 2025-09-07T07:54:58.3768021Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:54:58.3817207Z Entering 'third_party/XNNPACK' 2025-09-07T07:54:58.3880035Z Entering 'third_party/aiter' 2025-09-07T07:54:58.3930898Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:54:58.3987566Z Entering 'third_party/benchmark' 2025-09-07T07:54:58.4037460Z Entering 'third_party/composable_kernel' 2025-09-07T07:54:58.4093137Z Entering 'third_party/cpp-httplib' 2025-09-07T07:54:58.4142298Z Entering 'third_party/cpuinfo' 2025-09-07T07:54:58.4191234Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:54:58.4241498Z Entering 'third_party/cutlass' 2025-09-07T07:54:58.4298670Z Entering 'third_party/fbgemm' 2025-09-07T07:54:58.4349090Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:54:58.4397254Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:54:58.4449796Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:54:58.4497047Z Entering 
'third_party/fbgemm/external/cutlass' 2025-09-07T07:54:58.4551562Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:54:58.4598821Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:54:58.4645814Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:54:58.4696108Z Entering 'third_party/flash-attention' 2025-09-07T07:54:58.4745450Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:54:58.4797923Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:54:58.4854472Z Entering 'third_party/flatbuffers' 2025-09-07T07:54:58.4907282Z Entering 'third_party/fmt' 2025-09-07T07:54:58.4954858Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:54:58.5002918Z Entering 'third_party/gloo' 2025-09-07T07:54:58.5050513Z Entering 'third_party/googletest' 2025-09-07T07:54:58.5098845Z Entering 'third_party/ideep' 2025-09-07T07:54:58.5144897Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:54:58.5201823Z Entering 'third_party/ittapi' 2025-09-07T07:54:58.5250180Z Entering 'third_party/kineto' 2025-09-07T07:54:58.5296855Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:54:58.5343327Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:54:58.5392645Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:54:58.5439734Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:54:58.5486342Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:54:58.5530446Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:54:58.5581646Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:54:58.5628158Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:54:58.5674598Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:54:58.5722646Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:54:58.5771406Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:54:58.5818087Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:54:58.5866810Z Entering 'third_party/kleidiai' 2025-09-07T07:54:58.5915847Z Entering 'third_party/mimalloc' 2025-09-07T07:54:58.5963721Z Entering 'third_party/nlohmann' 2025-09-07T07:54:58.6012815Z Entering 'third_party/onnx' 2025-09-07T07:54:58.6074613Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:54:58.6128331Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:54:58.6177143Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:54:58.6224422Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:54:58.6270455Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:54:58.6317234Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:54:58.6364201Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:54:58.6410613Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:54:58.6456956Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:54:58.6502442Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:54:58.6552087Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:54:58.6601513Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:54:58.6668317Z Entering 'third_party/pocketfft' 2025-09-07T07:54:58.6716587Z Entering 'third_party/protobuf' 2025-09-07T07:54:58.6766869Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:54:58.6813887Z Entering 
'third_party/protobuf/third_party/googletest' 2025-09-07T07:54:58.6863338Z Entering 'third_party/psimd' 2025-09-07T07:54:58.6911136Z Entering 'third_party/pthreadpool' 2025-09-07T07:54:58.6959013Z Entering 'third_party/pybind11' 2025-09-07T07:54:58.7007415Z Entering 'third_party/python-peachpy' 2025-09-07T07:54:58.7055831Z Entering 'third_party/sleef' 2025-09-07T07:54:58.7104144Z Entering 'third_party/tensorpipe' 2025-09-07T07:54:58.7151532Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:54:58.7199770Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:54:58.7246782Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:54:58.7292361Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:54:58.7337864Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:54:58.7414282Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-09-07T07:54:58.7680815Z Entering 'android/libs/fbjni' 2025-09-07T07:54:58.7723010Z file:/home/david/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-09-07T07:54:58.7744773Z Entering 'third_party/FP16' 2025-09-07T07:54:58.7785680Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-09-07T07:54:58.7806827Z Entering 'third_party/FXdiv' 2025-09-07T07:54:58.7848668Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-09-07T07:54:58.7869973Z Entering 'third_party/NNPACK' 2025-09-07T07:54:58.7910589Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-09-07T07:54:58.7932210Z Entering 'third_party/NVTX' 2025-09-07T07:54:58.7973119Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-09-07T07:54:58.7995493Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:54:58.8036614Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-09-07T07:54:58.8058489Z Entering 'third_party/XNNPACK' 2025-09-07T07:54:58.8099339Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-09-07T07:54:58.8133536Z Entering 'third_party/aiter' 2025-09-07T07:54:58.8175432Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-09-07T07:54:58.8197060Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:54:58.8239336Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-09-07T07:54:58.8269920Z Entering 'third_party/benchmark' 2025-09-07T07:54:58.8311161Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:54:58.8333261Z Entering 'third_party/composable_kernel' 2025-09-07T07:54:58.8373371Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-09-07T07:54:58.8402484Z Entering 'third_party/cpp-httplib' 2025-09-07T07:54:58.8443641Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-09-07T07:54:58.8464568Z Entering 'third_party/cpuinfo' 2025-09-07T07:54:58.8505201Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-09-07T07:54:58.8527446Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:54:58.8568538Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-09-07T07:54:58.8589387Z Entering 'third_party/cutlass' 2025-09-07T07:54:58.8630032Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-09-07T07:54:58.8659904Z Entering 'third_party/fbgemm' 2025-09-07T07:54:58.8700709Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-09-07T07:54:58.8723216Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:54:58.8764621Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-09-07T07:54:58.8785270Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:54:58.8824432Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-09-07T07:54:58.8851506Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:54:58.8891580Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-09-07T07:54:58.8912396Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:54:58.8952780Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-09-07T07:54:58.8984835Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:54:58.9025287Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-09-07T07:54:58.9045680Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:54:58.9085673Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-09-07T07:54:58.9105674Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:54:58.9148228Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-09-07T07:54:58.9179005Z Entering 'third_party/flash-attention' 2025-09-07T07:54:58.9228156Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-09-07T07:54:58.9251209Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:54:58.9295587Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-09-07T07:54:58.9321807Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:54:58.9362016Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-09-07T07:54:58.9391728Z Entering 'third_party/flatbuffers' 2025-09-07T07:54:58.9434015Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-09-07T07:54:58.9458805Z Entering 'third_party/fmt' 2025-09-07T07:54:58.9499369Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-09-07T07:54:58.9521255Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:54:58.9562347Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-09-07T07:54:58.9583963Z Entering 'third_party/gloo' 2025-09-07T07:54:58.9624821Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-09-07T07:54:58.9646659Z Entering 'third_party/googletest' 2025-09-07T07:54:58.9690062Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:54:58.9711990Z Entering 'third_party/ideep' 2025-09-07T07:54:58.9752638Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-09-07T07:54:58.9773153Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:54:58.9814008Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-09-07T07:54:58.9843029Z Entering 'third_party/ittapi' 
2025-09-07T07:54:58.9885383Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-09-07T07:54:58.9907589Z Entering 'third_party/kineto' 2025-09-07T07:54:58.9948437Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-09-07T07:54:58.9969154Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:54:59.0010853Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-09-07T07:54:59.0031270Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:54:59.0073407Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-09-07T07:54:59.0098055Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:54:59.0142182Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-09-07T07:54:59.0163730Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:54:59.0204186Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-09-07T07:54:59.0225190Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:54:59.0265892Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-09-07T07:54:59.0284690Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:54:59.0327039Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-09-07T07:54:59.0350054Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:54:59.0391433Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-09-07T07:54:59.0411990Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:54:59.0452402Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:54:59.0472997Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:54:59.0513532Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-09-07T07:54:59.0534770Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:54:59.0574931Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-09-07T07:54:59.0599518Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:54:59.0639381Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-09-07T07:54:59.0659705Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:54:59.0699563Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-09-07T07:54:59.0722524Z Entering 'third_party/kleidiai' 2025-09-07T07:54:59.0763584Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-09-07T07:54:59.0785454Z Entering 'third_party/mimalloc' 2025-09-07T07:54:59.0825726Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-09-07T07:54:59.0847387Z Entering 'third_party/nlohmann' 2025-09-07T07:54:59.0887926Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-09-07T07:54:59.0911398Z Entering 'third_party/onnx' 2025-09-07T07:54:59.0952900Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-09-07T07:54:59.0989476Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:54:59.1031683Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:54:59.1057177Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:54:59.1098624Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-09-07T07:54:59.1121157Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:54:59.1162008Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:54:59.1182278Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:54:59.1222297Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:54:59.1242502Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:54:59.1283668Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-09-07T07:54:59.1304011Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:54:59.1345074Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-09-07T07:54:59.1367299Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:54:59.1408378Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-09-07T07:54:59.1428910Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:54:59.1468758Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-09-07T07:54:59.1488925Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:54:59.1529701Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-09-07T07:54:59.1547831Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:54:59.1589392Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-09-07T07:54:59.1611672Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:54:59.1651725Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-09-07T07:54:59.1674868Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:54:59.1714691Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-09-07T07:54:59.1754851Z Entering 'third_party/pocketfft' 2025-09-07T07:54:59.1796610Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 
2025-09-07T07:54:59.1817945Z Entering 'third_party/protobuf' 2025-09-07T07:54:59.1858264Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-09-07T07:54:59.1882237Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:54:59.1923279Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:54:59.1943379Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:54:59.1983659Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:54:59.2007638Z Entering 'third_party/psimd' 2025-09-07T07:54:59.2048588Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-09-07T07:54:59.2069940Z Entering 'third_party/pthreadpool' 2025-09-07T07:54:59.2110319Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-09-07T07:54:59.2131524Z Entering 'third_party/pybind11' 2025-09-07T07:54:59.2171606Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:54:59.2193589Z Entering 'third_party/python-peachpy' 2025-09-07T07:54:59.2233656Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-09-07T07:54:59.2256705Z Entering 'third_party/sleef' 2025-09-07T07:54:59.2296943Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-09-07T07:54:59.2318535Z Entering 'third_party/tensorpipe' 2025-09-07T07:54:59.2359297Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-09-07T07:54:59.2379654Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:54:59.2420533Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:54:59.2441192Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:54:59.2480925Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-09-07T07:54:59.2500753Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:54:59.2540096Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-09-07T07:54:59.2560524Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:54:59.2599999Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:54:59.2618695Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:54:59.2660033Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-09-07T07:54:59.2942169Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-09-07T07:54:59.3205265Z Entering 'android/libs/fbjni' 2025-09-07T07:54:59.3245749Z Entering 'third_party/FP16' 2025-09-07T07:54:59.3284559Z Entering 'third_party/FXdiv' 2025-09-07T07:54:59.3323785Z Entering 'third_party/NNPACK' 2025-09-07T07:54:59.3363429Z Entering 'third_party/NVTX' 2025-09-07T07:54:59.3402505Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:54:59.3441347Z Entering 'third_party/XNNPACK' 2025-09-07T07:54:59.3494896Z Entering 'third_party/aiter' 2025-09-07T07:54:59.3537848Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:54:59.3589782Z Entering 'third_party/benchmark' 2025-09-07T07:54:59.3628955Z Entering 'third_party/composable_kernel' 
2025-09-07T07:54:59.3676164Z Entering 'third_party/cpp-httplib' 2025-09-07T07:54:59.3714352Z Entering 'third_party/cpuinfo' 2025-09-07T07:54:59.3755117Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:54:59.3798948Z Entering 'third_party/cutlass' 2025-09-07T07:54:59.3851315Z Entering 'third_party/fbgemm' 2025-09-07T07:54:59.3896608Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:54:59.3938559Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:54:59.3987034Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:54:59.4028339Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:54:59.4079037Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:54:59.4119808Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:54:59.4160368Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:54:59.4205691Z Entering 'third_party/flash-attention' 2025-09-07T07:54:59.4248734Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:54:59.4296003Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:54:59.4347395Z Entering 'third_party/flatbuffers' 2025-09-07T07:54:59.4392836Z Entering 'third_party/fmt' 2025-09-07T07:54:59.4435661Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:54:59.4478377Z Entering 'third_party/gloo' 2025-09-07T07:54:59.4521041Z Entering 'third_party/googletest' 2025-09-07T07:54:59.4563984Z Entering 'third_party/ideep' 2025-09-07T07:54:59.4605288Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:54:59.4655178Z Entering 'third_party/ittapi' 2025-09-07T07:54:59.4697844Z Entering 'third_party/kineto' 2025-09-07T07:54:59.4739762Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:54:59.4781063Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:54:59.4824046Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:54:59.4866027Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:54:59.4907775Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:54:59.4947503Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:54:59.4992232Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:54:59.5033731Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:54:59.5076332Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:54:59.5118559Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:54:59.5162215Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:54:59.5203819Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:54:59.5247748Z Entering 'third_party/kleidiai' 2025-09-07T07:54:59.5291365Z Entering 'third_party/mimalloc' 2025-09-07T07:54:59.5334423Z Entering 'third_party/nlohmann' 2025-09-07T07:54:59.5378993Z Entering 'third_party/onnx' 2025-09-07T07:54:59.5436913Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:54:59.5484433Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:54:59.5528844Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:54:59.5570645Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:54:59.5612320Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:54:59.5653202Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:54:59.5695379Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:54:59.5736275Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:54:59.5776829Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:54:59.5815851Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:54:59.5859245Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:54:59.5902670Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:54:59.5963344Z Entering 'third_party/pocketfft' 2025-09-07T07:54:59.6005664Z Entering 'third_party/protobuf' 2025-09-07T07:54:59.6050376Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:54:59.6091704Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:54:59.6136170Z Entering 'third_party/psimd' 2025-09-07T07:54:59.6179057Z Entering 'third_party/pthreadpool' 2025-09-07T07:54:59.6221611Z Entering 'third_party/pybind11' 2025-09-07T07:54:59.6265338Z Entering 'third_party/python-peachpy' 2025-09-07T07:54:59.6308034Z Entering 'third_party/sleef' 2025-09-07T07:54:59.6351510Z Entering 'third_party/tensorpipe' 2025-09-07T07:54:59.6393860Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:54:59.6434553Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:54:59.6475328Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:54:59.6516239Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:54:59.6554898Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:54:59.6618071Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-09-07T07:54:59.6886398Z Entering 'android/libs/fbjni' 2025-09-07T07:54:59.6929207Z Entering 'third_party/FP16' 2025-09-07T07:54:59.6971922Z Entering 'third_party/FXdiv' 2025-09-07T07:54:59.7014522Z Entering 'third_party/NNPACK' 2025-09-07T07:54:59.7057443Z Entering 'third_party/NVTX' 2025-09-07T07:54:59.7100816Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:54:59.7143901Z Entering 'third_party/XNNPACK' 2025-09-07T07:54:59.7201125Z Entering 
'third_party/aiter' 2025-09-07T07:54:59.7243802Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:54:59.7295668Z Entering 'third_party/benchmark' 2025-09-07T07:54:59.7339105Z Entering 'third_party/composable_kernel' 2025-09-07T07:54:59.7389688Z Entering 'third_party/cpp-httplib' 2025-09-07T07:54:59.7432267Z Entering 'third_party/cpuinfo' 2025-09-07T07:54:59.7475791Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:54:59.7519400Z Entering 'third_party/cutlass' 2025-09-07T07:54:59.7571328Z Entering 'third_party/fbgemm' 2025-09-07T07:54:59.7616758Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:54:59.7658768Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:54:59.7706978Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:54:59.7748173Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:54:59.7797580Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:54:59.7838700Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:54:59.7879609Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:54:59.7924140Z Entering 'third_party/flash-attention' 2025-09-07T07:54:59.7967661Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:54:59.8014131Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:54:59.8065745Z Entering 'third_party/flatbuffers' 2025-09-07T07:54:59.8111782Z Entering 'third_party/fmt' 2025-09-07T07:54:59.8154461Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:54:59.8197593Z Entering 'third_party/gloo' 2025-09-07T07:54:59.8240488Z Entering 'third_party/googletest' 2025-09-07T07:54:59.8283290Z Entering 'third_party/ideep' 2025-09-07T07:54:59.8324657Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:54:59.8376211Z Entering 'third_party/ittapi' 2025-09-07T07:54:59.8419776Z Entering 'third_party/kineto' 2025-09-07T07:54:59.8462298Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 
2025-09-07T07:54:59.8503116Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:54:59.8546151Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:54:59.8587449Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:54:59.8629230Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:54:59.8669183Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:54:59.8714286Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:54:59.8756284Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:54:59.8797750Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:54:59.8840200Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:54:59.8886065Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:54:59.8926815Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:54:59.8971284Z Entering 'third_party/kleidiai' 2025-09-07T07:54:59.9015841Z Entering 'third_party/mimalloc' 2025-09-07T07:54:59.9059045Z Entering 'third_party/nlohmann' 2025-09-07T07:54:59.9102923Z Entering 'third_party/onnx' 2025-09-07T07:54:59.9159834Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:54:59.9206554Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:54:59.9250197Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:54:59.9290594Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:54:59.9331427Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:54:59.9371752Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:54:59.9415479Z Entering 
'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:54:59.9456097Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:54:59.9496694Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:54:59.9535495Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:54:59.9579059Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:54:59.9622810Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:54:59.9682821Z Entering 'third_party/pocketfft' 2025-09-07T07:54:59.9725793Z Entering 'third_party/protobuf' 2025-09-07T07:54:59.9770241Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:54:59.9811768Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:54:59.9857067Z Entering 'third_party/psimd' 2025-09-07T07:54:59.9899885Z Entering 'third_party/pthreadpool' 2025-09-07T07:54:59.9943013Z Entering 'third_party/pybind11' 2025-09-07T07:54:59.9986258Z Entering 'third_party/python-peachpy' 2025-09-07T07:55:00.0029473Z Entering 'third_party/sleef' 2025-09-07T07:55:00.0071765Z Entering 'third_party/tensorpipe' 2025-09-07T07:55:00.0114333Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:00.0155061Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:00.0195645Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:00.0236558Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:00.0276413Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:00.0335997Z ##[endgroup] 2025-09-07T07:55:00.0379727Z [command]/usr/bin/git log -1 --format=%H 2025-09-07T07:55:00.0408802Z 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:55:00.0581924Z ##[group]Run actions/checkout@v4 2025-09-07T07:55:00.0582148Z with: 2025-09-07T07:55:00.0582341Z ref: 
93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:55:00.0582585Z fetch-depth: 0 2025-09-07T07:55:00.0582760Z submodules: recursive 2025-09-07T07:55:00.0582954Z show-progress: false 2025-09-07T07:55:00.0583186Z repository: pytorch/pytorch 2025-09-07T07:55:00.0583539Z token: *** 2025-09-07T07:55:00.0583708Z ssh-strict: true 2025-09-07T07:55:00.0583884Z ssh-user: git 2025-09-07T07:55:00.0584079Z persist-credentials: true 2025-09-07T07:55:00.0584295Z clean: true 2025-09-07T07:55:00.0584473Z sparse-checkout-cone-mode: true 2025-09-07T07:55:00.0584692Z fetch-tags: false 2025-09-07T07:55:00.0584861Z lfs: false 2025-09-07T07:55:00.0585199Z set-safe-directory: true 2025-09-07T07:55:00.0585392Z env: 2025-09-07T07:55:00.0585558Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:00.0585746Z ##[endgroup] 2025-09-07T07:55:00.1469150Z Syncing repository: pytorch/pytorch 2025-09-07T07:55:00.1472018Z ##[group]Getting Git version info 2025-09-07T07:55:00.1472410Z Working directory is '/home/david/_work/pytorch/pytorch' 2025-09-07T07:55:00.1505534Z [command]/usr/bin/git version 2025-09-07T07:55:00.1542259Z git version 2.50.1 2025-09-07T07:55:00.1565301Z ##[endgroup] 2025-09-07T07:55:00.1576591Z Temporarily overriding HOME='/home/david/_work/_temp/a2f0442b-572f-4fbc-99e1-d67cdc4e322b' before making global git config changes 2025-09-07T07:55:00.1577283Z Adding repository directory to the temporary git global config as a safe directory 2025-09-07T07:55:00.1581390Z [command]/usr/bin/git config --global --add safe.directory /home/david/_work/pytorch/pytorch 2025-09-07T07:55:00.1618014Z [command]/usr/bin/git config --local --get remote.origin.url 2025-09-07T07:55:00.1640828Z https://github.com/pytorch/pytorch 2025-09-07T07:55:00.1654261Z ##[group]Removing previously created refs, to avoid conflicts 2025-09-07T07:55:00.1657586Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD 2025-09-07T07:55:00.1679154Z HEAD 2025-09-07T07:55:00.1717503Z ##[endgroup] 
2025-09-07T07:55:00.1720692Z [command]/usr/bin/git submodule status 2025-09-07T07:55:00.2030239Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-09-07T07:55:00.2115621Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081) 2025-09-07T07:55:00.2200380Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-09-07T07:55:00.2298308Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-09-07T07:55:00.2352648Z 2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07 third_party/NVTX (v3.1.0-263-g2942f16) 2025-09-07T07:55:00.2436166Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-09-07T07:55:00.2898715Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-09-07T07:55:00.2939601Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-09-07T07:55:00.2968857Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-09-07T07:55:00.3048650Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d) 2025-09-07T07:55:00.3178487Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0) 2025-09-07T07:55:00.3299189Z 5e3d2445e6a84d9599bee2bf78edbb4d80865e1d third_party/cpuinfo (5e3d244) 2025-09-07T07:55:00.3339465Z f937055efc6d414d11f4c6577e3977fe74f35fb6 third_party/cudnn_frontend (v0.5-52-gf937055) 2025-09-07T07:55:00.3437540Z e51efbfe18fe4f4cbb66ab814c55bf4aa0185491 third_party/cutlass (v4.1.0) 2025-09-07T07:55:00.3497916Z 4b39c551efe15e6bbade20565b0ceb2d8ce3352d third_party/fbgemm (v1.3.0-rc1-342-g4b39c551) 2025-09-07T07:55:00.3586156Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-09-07T07:55:00.3615693Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 2025-09-07T07:55:00.3982441Z 40626af88bd7df9a5fb80be7b25ac85b122d6c21 
third_party/fmt (11.2.0) 2025-09-07T07:55:00.4094335Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-09-07T07:55:00.4221827Z c7b7b022c124d9643957d9bd55f57ac59fce8fa2 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-33-gc7b7b02) 2025-09-07T07:55:00.4434692Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-09-07T07:55:00.4518606Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-09-07T07:55:00.4584874Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-09-07T07:55:00.4830818Z 5e7501833f1021ce6f618572d3baf657b6319658 third_party/kineto (remotes/origin/sraikund/test-98-g5e75018) 2025-09-07T07:55:00.4861034Z cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7 third_party/kleidiai (v1.8.0) 2025-09-07T07:55:00.4890366Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-09-07T07:55:00.4919076Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-09-07T07:55:00.5221885Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-09-07T07:55:00.5250534Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-09-07T07:55:00.5281439Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-09-07T07:55:00.5606466Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-09-07T07:55:00.5690105Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-09-07T07:55:00.5750457Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-09-07T07:55:00.5779445Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1) 2025-09-07T07:55:00.5861828Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-09-07T07:55:00.5942671Z 
5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-09-07T07:55:00.6022021Z af0118d13e52f5a08841464a768e01a0bf3e3075 third_party/tensorpipe (heads/main) 2025-09-07T07:55:00.6035484Z ##[group]Cleaning the repository 2025-09-07T07:55:00.6039820Z [command]/usr/bin/git clean -ffdx 2025-09-07T07:55:00.6372211Z [command]/usr/bin/git reset --hard HEAD 2025-09-07T07:55:01.0844518Z HEAD is now at 93fb23d6fae Build vLLM nightly wheels (#162000) 2025-09-07T07:55:01.0873656Z ##[endgroup] 2025-09-07T07:55:01.0875414Z ##[group]Disabling automatic garbage collection 2025-09-07T07:55:01.0880172Z [command]/usr/bin/git config --local gc.auto 0 2025-09-07T07:55:01.0914170Z ##[endgroup] 2025-09-07T07:55:01.0914498Z ##[group]Setting up auth 2025-09-07T07:55:01.0920173Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-09-07T07:55:01.0953040Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-09-07T07:55:01.1220845Z Entering 'android/libs/fbjni' 2025-09-07T07:55:01.1270262Z Entering 'third_party/FP16' 2025-09-07T07:55:01.1319361Z Entering 'third_party/FXdiv' 2025-09-07T07:55:01.1368209Z Entering 'third_party/NNPACK' 2025-09-07T07:55:01.1417496Z Entering 'third_party/NVTX' 2025-09-07T07:55:01.1466923Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:55:01.1516194Z Entering 'third_party/XNNPACK' 2025-09-07T07:55:01.1578843Z Entering 'third_party/aiter' 2025-09-07T07:55:01.1627925Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:55:01.1685444Z Entering 'third_party/benchmark' 2025-09-07T07:55:01.1737031Z Entering 'third_party/composable_kernel' 2025-09-07T07:55:01.1794365Z Entering 'third_party/cpp-httplib' 2025-09-07T07:55:01.1843042Z Entering 'third_party/cpuinfo' 2025-09-07T07:55:01.1892493Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:55:01.1944125Z Entering 
'third_party/cutlass' 2025-09-07T07:55:01.2001687Z Entering 'third_party/fbgemm' 2025-09-07T07:55:01.2053545Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:55:01.2102083Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:55:01.2154741Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:55:01.2201168Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:55:01.2255610Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:55:01.2302200Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:55:01.2348624Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:55:01.2399198Z Entering 'third_party/flash-attention' 2025-09-07T07:55:01.2449174Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:55:01.2501447Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:55:01.2557416Z Entering 'third_party/flatbuffers' 2025-09-07T07:55:01.2610039Z Entering 'third_party/fmt' 2025-09-07T07:55:01.2659177Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:55:01.2708255Z Entering 'third_party/gloo' 2025-09-07T07:55:01.2757038Z Entering 'third_party/googletest' 2025-09-07T07:55:01.2806674Z Entering 'third_party/ideep' 2025-09-07T07:55:01.2854015Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:55:01.2909295Z Entering 'third_party/ittapi' 2025-09-07T07:55:01.2959508Z Entering 'third_party/kineto' 2025-09-07T07:55:01.3007257Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:55:01.3054499Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:55:01.3104773Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:55:01.3153306Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:55:01.3201673Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:55:01.3247834Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:55:01.3297782Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:55:01.3346069Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:55:01.3394475Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:55:01.3444243Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:55:01.3496527Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:55:01.3542519Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:55:01.3591105Z Entering 'third_party/kleidiai' 2025-09-07T07:55:01.3640964Z Entering 'third_party/mimalloc' 2025-09-07T07:55:01.3690172Z Entering 'third_party/nlohmann' 2025-09-07T07:55:01.3741809Z Entering 'third_party/onnx' 2025-09-07T07:55:01.3804778Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:55:01.3857720Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:55:01.3909716Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:55:01.3957203Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:55:01.4002989Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:55:01.4051266Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:55:01.4098831Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:55:01.4144772Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:55:01.4190738Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:55:01.4235744Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:55:01.4286884Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:55:01.4339239Z 
Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:55:01.4404549Z Entering 'third_party/pocketfft' 2025-09-07T07:55:01.4454694Z Entering 'third_party/protobuf' 2025-09-07T07:55:01.4504880Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:55:01.4551522Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:55:01.4603929Z Entering 'third_party/psimd' 2025-09-07T07:55:01.4653326Z Entering 'third_party/pthreadpool' 2025-09-07T07:55:01.4702082Z Entering 'third_party/pybind11' 2025-09-07T07:55:01.4751249Z Entering 'third_party/python-peachpy' 2025-09-07T07:55:01.4799854Z Entering 'third_party/sleef' 2025-09-07T07:55:01.4849869Z Entering 'third_party/tensorpipe' 2025-09-07T07:55:01.4898144Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:01.4945975Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:01.4991967Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:01.5037855Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:01.5082307Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:01.5154105Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-09-07T07:55:01.5178287Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5185584Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-09-07T07:55:01.5218196Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-09-07T07:55:01.5475446Z Entering 'android/libs/fbjni' 2025-09-07T07:55:01.5502158Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5538337Z Entering 'third_party/FP16' 2025-09-07T07:55:01.5566835Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5602726Z 
Entering 'third_party/FXdiv' 2025-09-07T07:55:01.5630722Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5665612Z Entering 'third_party/NNPACK' 2025-09-07T07:55:01.5692913Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5728249Z Entering 'third_party/NVTX' 2025-09-07T07:55:01.5755642Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5791228Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:55:01.5818982Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5853801Z Entering 'third_party/XNNPACK' 2025-09-07T07:55:01.5882943Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5933580Z Entering 'third_party/aiter' 2025-09-07T07:55:01.5961783Z http.https://github.com/.extraheader 2025-09-07T07:55:01.5996911Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:55:01.6024323Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6070088Z Entering 'third_party/benchmark' 2025-09-07T07:55:01.6098136Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6133798Z Entering 'third_party/composable_kernel' 2025-09-07T07:55:01.6160979Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6204317Z Entering 'third_party/cpp-httplib' 2025-09-07T07:55:01.6232070Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6266963Z Entering 'third_party/cpuinfo' 2025-09-07T07:55:01.6293724Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6328822Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:55:01.6357203Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6392930Z Entering 'third_party/cutlass' 2025-09-07T07:55:01.6420893Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6463931Z Entering 'third_party/fbgemm' 2025-09-07T07:55:01.6491471Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6529252Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:55:01.6556023Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6590818Z Entering 
'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:55:01.6617136Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6657484Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:55:01.6683521Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6718190Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:55:01.6743738Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6786464Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:55:01.6812128Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6845946Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:55:01.6871647Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6904870Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:55:01.6930767Z http.https://github.com/.extraheader 2025-09-07T07:55:01.6968434Z Entering 'third_party/flash-attention' 2025-09-07T07:55:01.6996312Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7031196Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:55:01.7057965Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7098025Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:55:01.7123881Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7167527Z Entering 'third_party/flatbuffers' 2025-09-07T07:55:01.7196021Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7233213Z Entering 'third_party/fmt' 2025-09-07T07:55:01.7260598Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7295195Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:55:01.7322287Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7358112Z Entering 'third_party/gloo' 2025-09-07T07:55:01.7385988Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7420587Z Entering 'third_party/googletest' 2025-09-07T07:55:01.7448330Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7483123Z Entering 'third_party/ideep' 2025-09-07T07:55:01.7510084Z 
http.https://github.com/.extraheader 2025-09-07T07:55:01.7543366Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:55:01.7569769Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7612279Z Entering 'third_party/ittapi' 2025-09-07T07:55:01.7640500Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7674702Z Entering 'third_party/kineto' 2025-09-07T07:55:01.7702415Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7736407Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:55:01.7763164Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7796504Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:55:01.7825090Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7860412Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:55:01.7888909Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7925333Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:55:01.7954917Z http.https://github.com/.extraheader 2025-09-07T07:55:01.7992900Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:55:01.8020650Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8054419Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:55:01.8082100Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8120084Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:55:01.8146313Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8181347Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:55:01.8208781Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8244633Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:55:01.8272788Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8309741Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:55:01.8335870Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8374277Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:55:01.8399777Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8433135Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:55:01.8458750Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8495301Z Entering 'third_party/kleidiai' 2025-09-07T07:55:01.8524475Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8560323Z Entering 'third_party/mimalloc' 2025-09-07T07:55:01.8587325Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8621708Z Entering 'third_party/nlohmann' 2025-09-07T07:55:01.8648986Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8685238Z Entering 'third_party/onnx' 2025-09-07T07:55:01.8712321Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8761685Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:55:01.8789566Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8828949Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:55:01.8857061Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8892846Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:55:01.8920223Z http.https://github.com/.extraheader 2025-09-07T07:55:01.8968817Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:55:01.8994426Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9028205Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:55:01.9054067Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9087692Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:55:01.9113386Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9148749Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:55:01.9174859Z 
http.https://github.com/.extraheader 2025-09-07T07:55:01.9208812Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:55:01.9234482Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9268764Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:55:01.9294868Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9328057Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:55:01.9355607Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9392331Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:55:01.9419755Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9457113Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:55:01.9483335Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9536394Z Entering 'third_party/pocketfft' 2025-09-07T07:55:01.9564053Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9599265Z Entering 'third_party/protobuf' 2025-09-07T07:55:01.9629282Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9667678Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:55:01.9694601Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9729000Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:55:01.9754867Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9795106Z Entering 'third_party/psimd' 2025-09-07T07:55:01.9823038Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9858059Z Entering 'third_party/pthreadpool' 2025-09-07T07:55:01.9885127Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9921067Z Entering 'third_party/pybind11' 2025-09-07T07:55:01.9948782Z http.https://github.com/.extraheader 2025-09-07T07:55:01.9984155Z Entering 'third_party/python-peachpy' 2025-09-07T07:55:02.0010430Z http.https://github.com/.extraheader 2025-09-07T07:55:02.0045929Z Entering 
'third_party/sleef' 2025-09-07T07:55:02.0073220Z http.https://github.com/.extraheader 2025-09-07T07:55:02.0109354Z Entering 'third_party/tensorpipe' 2025-09-07T07:55:02.0137239Z http.https://github.com/.extraheader 2025-09-07T07:55:02.0172064Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:02.0197693Z http.https://github.com/.extraheader 2025-09-07T07:55:02.0232652Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:02.0258053Z http.https://github.com/.extraheader 2025-09-07T07:55:02.0291158Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:02.0316706Z http.https://github.com/.extraheader 2025-09-07T07:55:02.0349953Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:02.0374877Z http.https://github.com/.extraheader 2025-09-07T07:55:02.0406926Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:02.0432461Z http.https://github.com/.extraheader 2025-09-07T07:55:02.0492915Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-09-07T07:55:02.0531523Z ##[endgroup] 2025-09-07T07:55:02.0531852Z ##[group]Fetching the repository 2025-09-07T07:55:02.0538514Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-09-07T07:55:02.6567487Z [command]/usr/bin/git rev-parse --verify --quiet 93fb23d6fae7c4e82c4239a1033e522088742634^{object} 2025-09-07T07:55:02.6595796Z 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:55:02.6601308Z ##[endgroup] 2025-09-07T07:55:02.6601821Z ##[group]Determining the checkout info 2025-09-07T07:55:02.6602400Z ##[endgroup] 2025-09-07T07:55:02.6605816Z [command]/usr/bin/git sparse-checkout disable 2025-09-07T07:55:02.6795129Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-09-07T07:55:02.6823621Z ##[group]Checking out the ref 
2025-09-07T07:55:02.6827852Z [command]/usr/bin/git checkout --progress --force 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:55:02.7156827Z HEAD is now at 93fb23d6fae Build vLLM nightly wheels (#162000) 2025-09-07T07:55:02.7167733Z ##[endgroup] 2025-09-07T07:55:02.7168095Z ##[group]Setting up auth for fetching submodules 2025-09-07T07:55:02.7171684Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-09-07T07:55:02.7206301Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-09-07T07:55:02.7235247Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-09-07T07:55:02.7264129Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-09-07T07:55:02.7290720Z ##[endgroup] 2025-09-07T07:55:02.7291029Z ##[group]Fetching submodules 2025-09-07T07:55:02.7293668Z [command]/usr/bin/git submodule sync --recursive 2025-09-07T07:55:02.7569619Z Synchronizing submodule url for 'android/libs/fbjni' 2025-09-07T07:55:02.7594597Z Synchronizing submodule url for 'third_party/FP16' 2025-09-07T07:55:02.7618773Z Synchronizing submodule url for 'third_party/FXdiv' 2025-09-07T07:55:02.7642808Z Synchronizing submodule url for 'third_party/NNPACK' 2025-09-07T07:55:02.7666884Z Synchronizing submodule url for 'third_party/NVTX' 2025-09-07T07:55:02.7691350Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-09-07T07:55:02.7715636Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-09-07T07:55:02.7753645Z Synchronizing submodule url for 'third_party/aiter' 2025-09-07T07:55:02.7777390Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:55:02.7810576Z Synchronizing submodule url for 'third_party/benchmark' 2025-09-07T07:55:02.7834346Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-09-07T07:55:02.7867226Z 
Synchronizing submodule url for 'third_party/cpp-httplib' 2025-09-07T07:55:02.7891480Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-09-07T07:55:02.7916096Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-09-07T07:55:02.7940357Z Synchronizing submodule url for 'third_party/cutlass' 2025-09-07T07:55:02.7972601Z Synchronizing submodule url for 'third_party/fbgemm' 2025-09-07T07:55:02.7997523Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-09-07T07:55:02.8020856Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:55:02.8051279Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:55:02.8073088Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-09-07T07:55:02.8106549Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-09-07T07:55:02.8128476Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:55:02.8149877Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-09-07T07:55:02.8176372Z Synchronizing submodule url for 'third_party/flash-attention' 2025-09-07T07:55:02.8199557Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:55:02.8226836Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:55:02.8257950Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-09-07T07:55:02.8284402Z Synchronizing submodule url for 'third_party/fmt' 2025-09-07T07:55:02.8308898Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:55:02.8332749Z Synchronizing submodule url for 'third_party/gloo' 2025-09-07T07:55:02.8356896Z Synchronizing submodule url for 'third_party/googletest' 2025-09-07T07:55:02.8380571Z Synchronizing submodule url for 'third_party/ideep' 2025-09-07T07:55:02.8401803Z Synchronizing submodule url for 
'third_party/ideep/mkl-dnn' 2025-09-07T07:55:02.8432867Z Synchronizing submodule url for 'third_party/ittapi' 2025-09-07T07:55:02.8456951Z Synchronizing submodule url for 'third_party/kineto' 2025-09-07T07:55:02.8479317Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:55:02.8502345Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:55:02.8527378Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:55:02.8550056Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:55:02.8572959Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:55:02.8597847Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:55:02.8624270Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:55:02.8647919Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:55:02.8670466Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:55:02.8693921Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:55:02.8718389Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:55:02.8741559Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:55:02.8765353Z Synchronizing submodule url for 'third_party/kleidiai' 2025-09-07T07:55:02.8789755Z Synchronizing submodule url for 'third_party/mimalloc' 2025-09-07T07:55:02.8813496Z Synchronizing submodule url for 'third_party/nlohmann' 2025-09-07T07:55:02.8838931Z Synchronizing 
submodule url for 'third_party/onnx' 2025-09-07T07:55:02.8875592Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-09-07T07:55:02.8908561Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-09-07T07:55:02.8932070Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:55:02.8954646Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:55:02.8977429Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:55:02.9000592Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:55:02.9023555Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:55:02.9044599Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:55:02.9069114Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:55:02.9089460Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:55:02.9112936Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:55:02.9136086Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:55:02.9183431Z Synchronizing submodule url for 'third_party/pocketfft' 2025-09-07T07:55:02.9207635Z Synchronizing submodule url for 'third_party/protobuf' 2025-09-07T07:55:02.9233012Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:55:02.9257616Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-09-07T07:55:02.9284720Z Synchronizing submodule url for 'third_party/psimd' 2025-09-07T07:55:02.9309121Z Synchronizing submodule url for 'third_party/pthreadpool' 
2025-09-07T07:55:02.9332505Z Synchronizing submodule url for 'third_party/pybind11' 2025-09-07T07:55:02.9356593Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-09-07T07:55:02.9380295Z Synchronizing submodule url for 'third_party/sleef' 2025-09-07T07:55:02.9404388Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-09-07T07:55:02.9426911Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:02.9448490Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:02.9471874Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:02.9493780Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:02.9518486Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:02.9558138Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-09-07T07:55:02.9956602Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-09-07T07:55:03.0082476Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-09-07T07:55:03.0183680Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-09-07T07:55:03.0410658Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-09-07T07:55:03.1233092Z Submodule path 'third_party/NVTX': checked out '2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07' 2025-09-07T07:55:03.1576759Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-09-07T07:55:03.4168508Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-09-07T07:55:03.6186914Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 
2025-09-07T07:55:03.8938118Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-09-07T07:55:03.9140194Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-09-07T07:55:04.2324243Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-09-07T07:55:04.2875655Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-09-07T07:55:04.3991773Z Submodule path 'third_party/cpuinfo': checked out '5e3d2445e6a84d9599bee2bf78edbb4d80865e1d' 2025-09-07T07:55:04.4437093Z Submodule path 'third_party/cudnn_frontend': checked out 'f937055efc6d414d11f4c6577e3977fe74f35fb6' 2025-09-07T07:55:05.2460087Z Submodule path 'third_party/cutlass': checked out 'e51efbfe18fe4f4cbb66ab814c55bf4aa0185491' 2025-09-07T07:55:05.3767285Z Submodule path 'third_party/fbgemm': checked out '4b39c551efe15e6bbade20565b0ceb2d8ce3352d' 2025-09-07T07:55:05.4355371Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-09-07T07:55:05.7052157Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out 'b1281b8b08d973a7064f864f47eeb30f3e2596e9' 2025-09-07T07:55:05.8257678Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-09-07T07:55:06.0424003Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '311f3c8e51dc0eb56310cfc6980bf63d0fbd7917' 2025-09-07T07:55:06.0792331Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-09-07T07:55:06.0927636Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-09-07T07:55:06.1935483Z Submodule path 'third_party/fbgemm/external/json': checked out 
'9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-09-07T07:55:06.2693812Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-09-07T07:55:06.5434088Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-09-07T07:55:06.7620515Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-09-07T07:55:06.9829256Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-09-07T07:55:07.0113906Z Submodule path 'third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-09-07T07:55:07.0496736Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-09-07T07:55:07.0717208Z Submodule path 'third_party/gloo': checked out 'c7b7b022c124d9643957d9bd55f57ac59fce8fa2' 2025-09-07T07:55:07.1067254Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-09-07T07:55:07.1206865Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-09-07T07:55:07.8497728Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-09-07T07:55:07.8713843Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-09-07T07:55:07.9566445Z Submodule path 'third_party/kineto': checked out '5e7501833f1021ce6f618572d3baf657b6319658' 2025-09-07T07:55:08.0969328Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e' 2025-09-07T07:55:08.2714667Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-09-07T07:55:08.2883518Z Submodule path 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-09-07T07:55:08.3187493Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-09-07T07:55:08.3334164Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-09-07T07:55:08.3438368Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-09-07T07:55:08.3603300Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-09-07T07:55:08.3944662Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850' 2025-09-07T07:55:08.4794486Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-09-07T07:55:08.4958528Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-09-07T07:55:08.5236610Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164' 2025-09-07T07:55:08.5576810Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2025-09-07T07:55:08.6004811Z Submodule path 'third_party/kleidiai': checked out 'cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7' 2025-09-07T07:55:08.6508461Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-09-07T07:55:08.7618065Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 
2025-09-07T07:55:08.9177630Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-09-07T07:55:08.9512648Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-09-07T07:55:09.0139481Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-09-07T07:55:09.0318570Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-09-07T07:55:09.0655561Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-09-07T07:55:09.0784441Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-09-07T07:55:09.1786848Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-09-07T07:55:09.1938032Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-09-07T07:55:09.2081365Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-09-07T07:55:09.2223524Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-09-07T07:55:09.4914867Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-09-07T07:55:09.5272135Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-09-07T07:55:09.6587057Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 
2025-09-07T07:55:09.6712429Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-09-07T07:55:09.9113383Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-09-07T07:55:09.9256864Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-09-07T07:55:09.9650979Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-09-07T07:55:09.9754663Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-09-07T07:55:09.9903430Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-09-07T07:55:10.0082989Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-09-07T07:55:10.0508408Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-09-07T07:55:10.0981065Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-09-07T07:55:10.1202474Z Submodule path 'third_party/tensorpipe': checked out 'af0118d13e52f5a08841464a768e01a0bf3e3075' 2025-09-07T07:55:10.1543856Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-09-07T07:55:10.1694180Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-09-07T07:55:10.2197242Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-09-07T07:55:10.2433734Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-09-07T07:55:10.2535832Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out 
'6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-09-07T07:55:10.2593382Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-09-07T07:55:10.2855854Z Entering 'android/libs/fbjni' 2025-09-07T07:55:10.2896217Z Entering 'third_party/FP16' 2025-09-07T07:55:10.2935806Z Entering 'third_party/FXdiv' 2025-09-07T07:55:10.2974254Z Entering 'third_party/NNPACK' 2025-09-07T07:55:10.3012721Z Entering 'third_party/NVTX' 2025-09-07T07:55:10.3051793Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:55:10.3090560Z Entering 'third_party/XNNPACK' 2025-09-07T07:55:10.3149592Z Entering 'third_party/aiter' 2025-09-07T07:55:10.3191088Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:55:10.3237940Z Entering 'third_party/benchmark' 2025-09-07T07:55:10.3278460Z Entering 'third_party/composable_kernel' 2025-09-07T07:55:10.3325589Z Entering 'third_party/cpp-httplib' 2025-09-07T07:55:10.3363721Z Entering 'third_party/cpuinfo' 2025-09-07T07:55:10.3402537Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:55:10.3440965Z Entering 'third_party/cutlass' 2025-09-07T07:55:10.3488497Z Entering 'third_party/fbgemm' 2025-09-07T07:55:10.3528786Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:55:10.3566156Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:55:10.3610188Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:55:10.3647600Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:55:10.3693372Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:55:10.3730759Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:55:10.3767977Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:55:10.3808004Z Entering 'third_party/flash-attention' 2025-09-07T07:55:10.3846563Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:55:10.3889696Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:55:10.3936204Z Entering 
'third_party/flatbuffers' 2025-09-07T07:55:10.3977463Z Entering 'third_party/fmt' 2025-09-07T07:55:10.4015993Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:55:10.4054780Z Entering 'third_party/gloo' 2025-09-07T07:55:10.4093008Z Entering 'third_party/googletest' 2025-09-07T07:55:10.4131022Z Entering 'third_party/ideep' 2025-09-07T07:55:10.4167441Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:55:10.4212705Z Entering 'third_party/ittapi' 2025-09-07T07:55:10.4251474Z Entering 'third_party/kineto' 2025-09-07T07:55:10.4288908Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:55:10.4325489Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:55:10.4364260Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:55:10.4401855Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:55:10.4440619Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:55:10.4476085Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:55:10.4516701Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:55:10.4554140Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:55:10.4597728Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:55:10.4638734Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:55:10.4685288Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:55:10.4727752Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:55:10.4772019Z Entering 'third_party/kleidiai' 2025-09-07T07:55:10.4816292Z Entering 'third_party/mimalloc' 2025-09-07T07:55:10.4860437Z Entering 'third_party/nlohmann' 2025-09-07T07:55:10.4906347Z Entering 'third_party/onnx' 
2025-09-07T07:55:10.4960127Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:55:10.5003592Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:55:10.5043578Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:55:10.5081043Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:55:10.5120011Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:55:10.5162452Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:55:10.5205381Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:55:10.5246415Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:55:10.5283473Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:55:10.5319477Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:55:10.5358838Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:55:10.5402416Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:55:10.5465217Z Entering 'third_party/pocketfft' 2025-09-07T07:55:10.5509422Z Entering 'third_party/protobuf' 2025-09-07T07:55:10.5555896Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:55:10.5599435Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:55:10.5642057Z Entering 'third_party/psimd' 2025-09-07T07:55:10.5685391Z Entering 'third_party/pthreadpool' 2025-09-07T07:55:10.5732239Z Entering 'third_party/pybind11' 2025-09-07T07:55:10.5777972Z Entering 'third_party/python-peachpy' 2025-09-07T07:55:10.5822238Z Entering 'third_party/sleef' 2025-09-07T07:55:10.5865878Z Entering 'third_party/tensorpipe' 2025-09-07T07:55:10.5912964Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:10.5956544Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:10.5999080Z Entering 
'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:10.6041374Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:10.6081827Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:10.6146195Z ##[endgroup] 2025-09-07T07:55:10.6146643Z ##[group]Persisting credentials for submodules 2025-09-07T07:55:10.6155659Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-09-07T07:55:10.6427888Z Entering 'android/libs/fbjni' 2025-09-07T07:55:10.6455786Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6456089Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6493349Z Entering 'third_party/FP16' 2025-09-07T07:55:10.6519608Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6519900Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6554053Z Entering 'third_party/FXdiv' 2025-09-07T07:55:10.6580468Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6580757Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6613336Z Entering 'third_party/NNPACK' 2025-09-07T07:55:10.6638686Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6657775Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6672282Z Entering 'third_party/NVTX' 2025-09-07T07:55:10.6698148Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6698637Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6732004Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:55:10.6766126Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6766605Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6806314Z Entering 'third_party/XNNPACK' 2025-09-07T07:55:10.6833959Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6834420Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6882202Z Entering 'third_party/aiter' 2025-09-07T07:55:10.6910606Z 
url.https://github.com/.insteadof 2025-09-07T07:55:10.6911100Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6951584Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:55:10.6977751Z url.https://github.com/.insteadof 2025-09-07T07:55:10.6978188Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7033481Z Entering 'third_party/benchmark' 2025-09-07T07:55:10.7060071Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7061084Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7094285Z Entering 'third_party/composable_kernel' 2025-09-07T07:55:10.7121322Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7121756Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7165156Z Entering 'third_party/cpp-httplib' 2025-09-07T07:55:10.7190540Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7190970Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7231898Z Entering 'third_party/cpuinfo' 2025-09-07T07:55:10.7267319Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7267737Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7306722Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:55:10.7332374Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7332797Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7366482Z Entering 'third_party/cutlass' 2025-09-07T07:55:10.7391873Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7392284Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7436371Z Entering 'third_party/fbgemm' 2025-09-07T07:55:10.7466692Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7467016Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7506124Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:55:10.7533504Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7533800Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7573222Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:55:10.7601431Z url.https://github.com/.insteadof 
2025-09-07T07:55:10.7601743Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7647501Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:55:10.7674373Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7674687Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7709553Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:55:10.7736770Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7737073Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7781153Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:55:10.7807242Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7807576Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7841148Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:55:10.7865818Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7866122Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7898608Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:55:10.7923314Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7923628Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7960010Z Entering 'third_party/flash-attention' 2025-09-07T07:55:10.7986275Z url.https://github.com/.insteadof 2025-09-07T07:55:10.7986586Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8019020Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:55:10.8043459Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8043788Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8082575Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:55:10.8107094Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8107400Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8149979Z Entering 'third_party/flatbuffers' 2025-09-07T07:55:10.8174739Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8175195Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8210684Z Entering 'third_party/fmt' 2025-09-07T07:55:10.8235865Z 
url.https://github.com/.insteadof 2025-09-07T07:55:10.8236347Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8268741Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:55:10.8293736Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8294238Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8326394Z Entering 'third_party/gloo' 2025-09-07T07:55:10.8351285Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8354608Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8383893Z Entering 'third_party/googletest' 2025-09-07T07:55:10.8408556Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8409768Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8440626Z Entering 'third_party/ideep' 2025-09-07T07:55:10.8474710Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8475435Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8512177Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:55:10.8538510Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8538973Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8588889Z Entering 'third_party/ittapi' 2025-09-07T07:55:10.8616647Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8617066Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8652797Z Entering 'third_party/kineto' 2025-09-07T07:55:10.8679850Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8680287Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8714872Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:55:10.8739750Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8740135Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8774090Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:55:10.8799370Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8799788Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8835800Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:55:10.8860574Z 
url.https://github.com/.insteadof 2025-09-07T07:55:10.8860982Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8895502Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:55:10.8920396Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8920858Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8954548Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:55:10.8979554Z url.https://github.com/.insteadof 2025-09-07T07:55:10.8979960Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9012591Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:55:10.9038032Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9038444Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9075164Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:55:10.9100253Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9100566Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9134600Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:55:10.9159442Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9159741Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9194561Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:55:10.9219136Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9219433Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9254390Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:55:10.9279125Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9279443Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9315328Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:55:10.9340481Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9340771Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9376377Z Entering 
'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:55:10.9401825Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9402162Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9440972Z Entering 'third_party/kleidiai' 2025-09-07T07:55:10.9469846Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9470184Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9506714Z Entering 'third_party/mimalloc' 2025-09-07T07:55:10.9533582Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9533868Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9573174Z Entering 'third_party/nlohmann' 2025-09-07T07:55:10.9602398Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9602696Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9651329Z Entering 'third_party/onnx' 2025-09-07T07:55:10.9686284Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9686597Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9749385Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:55:10.9780176Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9780462Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9824131Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:55:10.9857905Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9858262Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9899775Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:55:10.9928497Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9928839Z url.https://github.com/.insteadof 2025-09-07T07:55:10.9970915Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:55:11.0001467Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0001790Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0041184Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:55:11.0071066Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0071361Z url.https://github.com/.insteadof 
2025-09-07T07:55:11.0108991Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:55:11.0139245Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0139615Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0180002Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:55:11.0208213Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0208713Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0252212Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:55:11.0279389Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0279673Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0321900Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:55:11.0352923Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0353254Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0392139Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:55:11.0428954Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0429477Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0478606Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:55:11.0513125Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0513423Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0573410Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:55:11.0609100Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0609568Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0693483Z Entering 'third_party/pocketfft' 2025-09-07T07:55:11.0727579Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0728101Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0778895Z Entering 'third_party/protobuf' 2025-09-07T07:55:11.0813331Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0813843Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0860044Z 
Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:55:11.0892944Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0893371Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0935306Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:55:11.0963482Z url.https://github.com/.insteadof 2025-09-07T07:55:11.0963973Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1002621Z Entering 'third_party/psimd' 2025-09-07T07:55:11.1036641Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1037143Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1079460Z Entering 'third_party/pthreadpool' 2025-09-07T07:55:11.1108422Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1108908Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1154209Z Entering 'third_party/pybind11' 2025-09-07T07:55:11.1185651Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1186122Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1235616Z Entering 'third_party/python-peachpy' 2025-09-07T07:55:11.1271988Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1272348Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1315328Z Entering 'third_party/sleef' 2025-09-07T07:55:11.1348692Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1349157Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1396531Z Entering 'third_party/tensorpipe' 2025-09-07T07:55:11.1432045Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1432713Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1480160Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:11.1512545Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1512985Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1551010Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:11.1579074Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1579548Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1626919Z Entering 
'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:11.1662947Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1663383Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1707787Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:11.1741586Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1742087Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1782683Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:11.1815555Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1816051Z url.https://github.com/.insteadof 2025-09-07T07:55:11.1896094Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-09-07T07:55:11.2186097Z Entering 'android/libs/fbjni' 2025-09-07T07:55:11.2228981Z file:/home/david/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-09-07T07:55:11.2248927Z Entering 'third_party/FP16' 2025-09-07T07:55:11.2288456Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-09-07T07:55:11.2308002Z Entering 'third_party/FXdiv' 2025-09-07T07:55:11.2347436Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-09-07T07:55:11.2366674Z Entering 'third_party/NNPACK' 2025-09-07T07:55:11.2405977Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-09-07T07:55:11.2425656Z Entering 'third_party/NVTX' 2025-09-07T07:55:11.2464424Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-09-07T07:55:11.2492340Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:55:11.2538355Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config 
remote.origin.url 2025-09-07T07:55:11.2563555Z Entering 'third_party/XNNPACK' 2025-09-07T07:55:11.2609759Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-09-07T07:55:11.2646797Z Entering 'third_party/aiter' 2025-09-07T07:55:11.2688686Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-09-07T07:55:11.2710388Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:55:11.2749895Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-09-07T07:55:11.2777810Z Entering 'third_party/benchmark' 2025-09-07T07:55:11.2817704Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:55:11.2839196Z Entering 'third_party/composable_kernel' 2025-09-07T07:55:11.2878852Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-09-07T07:55:11.2914471Z Entering 'third_party/cpp-httplib' 2025-09-07T07:55:11.2961012Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-09-07T07:55:11.2982283Z Entering 'third_party/cpuinfo' 2025-09-07T07:55:11.3024608Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-09-07T07:55:11.3045651Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:55:11.3088314Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-09-07T07:55:11.3110455Z Entering 'third_party/cutlass' 2025-09-07T07:55:11.3157808Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-09-07T07:55:11.3188429Z Entering 'third_party/fbgemm' 2025-09-07T07:55:11.3232932Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 
2025-09-07T07:55:11.3256514Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:55:11.3296372Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-09-07T07:55:11.3316118Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:55:11.3358827Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-09-07T07:55:11.3387946Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:55:11.3430384Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-09-07T07:55:11.3451600Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:55:11.3493363Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-09-07T07:55:11.3527384Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:55:11.3568484Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-09-07T07:55:11.3588864Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:55:11.3627587Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-09-07T07:55:11.3646259Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:55:11.3686029Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-09-07T07:55:11.3708738Z Entering 'third_party/flash-attention' 2025-09-07T07:55:11.3750949Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-09-07T07:55:11.3771483Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:55:11.3809715Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-09-07T07:55:11.3834698Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:55:11.3872794Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-09-07T07:55:11.3901527Z Entering 'third_party/flatbuffers' 2025-09-07T07:55:11.3940699Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-09-07T07:55:11.3964708Z Entering 'third_party/fmt' 2025-09-07T07:55:11.4003866Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-09-07T07:55:11.4025089Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:55:11.4064276Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-09-07T07:55:11.4085056Z Entering 'third_party/gloo' 2025-09-07T07:55:11.4124573Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-09-07T07:55:11.4145561Z Entering 'third_party/googletest' 2025-09-07T07:55:11.4186439Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:55:11.4207204Z Entering 'third_party/ideep' 2025-09-07T07:55:11.4258108Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-09-07T07:55:11.4279502Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:55:11.4319462Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-09-07T07:55:11.4348896Z Entering 'third_party/ittapi' 2025-09-07T07:55:11.4389683Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-09-07T07:55:11.4410835Z Entering 'third_party/kineto' 2025-09-07T07:55:11.4450844Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-09-07T07:55:11.4471862Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:55:11.4511966Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-09-07T07:55:11.4531503Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:55:11.4575239Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-09-07T07:55:11.4597760Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:55:11.4639065Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-09-07T07:55:11.4660236Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:55:11.4700908Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-09-07T07:55:11.4726611Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:55:11.4769012Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-09-07T07:55:11.4788446Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:55:11.4829931Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-09-07T07:55:11.4854349Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:55:11.4895597Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-09-07T07:55:11.4916356Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:55:11.4957796Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:55:11.4978611Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:55:11.5019476Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-09-07T07:55:11.5041300Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:55:11.5081709Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-09-07T07:55:11.5106976Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:55:11.5147653Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-09-07T07:55:11.5167815Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:55:11.5208761Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-09-07T07:55:11.5232154Z Entering 'third_party/kleidiai' 2025-09-07T07:55:11.5273949Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-09-07T07:55:11.5295778Z Entering 'third_party/mimalloc' 2025-09-07T07:55:11.5336808Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-09-07T07:55:11.5360177Z Entering 'third_party/nlohmann' 
2025-09-07T07:55:11.5400578Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-09-07T07:55:11.5423428Z Entering 'third_party/onnx' 2025-09-07T07:55:11.5464682Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-09-07T07:55:11.5501554Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:55:11.5543823Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:55:11.5569979Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:55:11.5611968Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-09-07T07:55:11.5634257Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:55:11.5674593Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:55:11.5696385Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:55:11.5738637Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:55:11.5759902Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:55:11.5800592Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-09-07T07:55:11.5821864Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:55:11.5863441Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-09-07T07:55:11.5886121Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:55:11.5929137Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-09-07T07:55:11.5949792Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:55:11.5990920Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-09-07T07:55:11.6012056Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:55:11.6054416Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-09-07T07:55:11.6073964Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:55:11.6115633Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-09-07T07:55:11.6137788Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:55:11.6177819Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-09-07T07:55:11.6201022Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:55:11.6244459Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-09-07T07:55:11.6286535Z Entering 'third_party/pocketfft' 2025-09-07T07:55:11.6328327Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-09-07T07:55:11.6351014Z Entering 'third_party/protobuf' 2025-09-07T07:55:11.6392860Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-09-07T07:55:11.6417759Z Entering 'third_party/protobuf/third_party/benchmark' 
2025-09-07T07:55:11.6459244Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-09-07T07:55:11.6479535Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:55:11.6521217Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:55:11.6543629Z Entering 'third_party/psimd' 2025-09-07T07:55:11.6586687Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-09-07T07:55:11.6607547Z Entering 'third_party/pthreadpool' 2025-09-07T07:55:11.6650196Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-09-07T07:55:11.6670362Z Entering 'third_party/pybind11' 2025-09-07T07:55:11.6711445Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:55:11.6732327Z Entering 'third_party/python-peachpy' 2025-09-07T07:55:11.6771529Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-09-07T07:55:11.6791890Z Entering 'third_party/sleef' 2025-09-07T07:55:11.6835712Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-09-07T07:55:11.6856319Z Entering 'third_party/tensorpipe' 2025-09-07T07:55:11.6895877Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-09-07T07:55:11.6915336Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:11.6955372Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-09-07T07:55:11.6975388Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:11.7012831Z 
file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-09-07T07:55:11.7036783Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:11.7083033Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-09-07T07:55:11.7106490Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:11.7147148Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-09-07T07:55:11.7168880Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:11.7212814Z file:/home/david/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-09-07T07:55:11.7505311Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-09-07T07:55:11.7774472Z Entering 'android/libs/fbjni' 2025-09-07T07:55:11.7827726Z Entering 'third_party/FP16' 2025-09-07T07:55:11.7872342Z Entering 'third_party/FXdiv' 2025-09-07T07:55:11.7919669Z Entering 'third_party/NNPACK' 2025-09-07T07:55:11.7964077Z Entering 'third_party/NVTX' 2025-09-07T07:55:11.8008685Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:55:11.8065364Z Entering 'third_party/XNNPACK' 2025-09-07T07:55:11.8131917Z Entering 'third_party/aiter' 2025-09-07T07:55:11.8183196Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:55:11.8242582Z Entering 'third_party/benchmark' 2025-09-07T07:55:11.8294617Z Entering 'third_party/composable_kernel' 2025-09-07T07:55:11.8355954Z Entering 'third_party/cpp-httplib' 2025-09-07T07:55:11.8403539Z Entering 'third_party/cpuinfo' 2025-09-07T07:55:11.8448892Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:55:11.8500437Z Entering 'third_party/cutlass' 
2025-09-07T07:55:11.8557463Z Entering 'third_party/fbgemm' 2025-09-07T07:55:11.8605448Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:55:11.8655776Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:55:11.8711145Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:55:11.8760346Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:55:11.8815356Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:55:11.8864702Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:55:11.8910067Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:55:11.8960793Z Entering 'third_party/flash-attention' 2025-09-07T07:55:11.9007416Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:55:11.9053460Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:55:11.9104126Z Entering 'third_party/flatbuffers' 2025-09-07T07:55:11.9154307Z Entering 'third_party/fmt' 2025-09-07T07:55:11.9210900Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:55:11.9269710Z Entering 'third_party/gloo' 2025-09-07T07:55:11.9322234Z Entering 'third_party/googletest' 2025-09-07T07:55:11.9373061Z Entering 'third_party/ideep' 2025-09-07T07:55:11.9420889Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:55:11.9477253Z Entering 'third_party/ittapi' 2025-09-07T07:55:11.9527345Z Entering 'third_party/kineto' 2025-09-07T07:55:11.9573272Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:55:11.9621351Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:55:11.9673010Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T07:55:11.9720638Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:55:11.9765704Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:55:11.9805424Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:55:11.9849803Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:55:11.9889995Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:55:11.9934727Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:55:11.9976168Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:55:12.0018932Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:55:12.0068143Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:55:12.0116855Z Entering 'third_party/kleidiai' 2025-09-07T07:55:12.0168604Z Entering 'third_party/mimalloc' 2025-09-07T07:55:12.0216267Z Entering 'third_party/nlohmann' 2025-09-07T07:55:12.0264192Z Entering 'third_party/onnx' 2025-09-07T07:55:12.0325244Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:55:12.0376553Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:55:12.0423015Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:55:12.0461449Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:55:12.0505368Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:55:12.0546357Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:55:12.0584874Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:55:12.0623159Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:55:12.0660909Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T07:55:12.0698057Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:55:12.0739443Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:55:12.0781011Z 
Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:55:12.0839423Z Entering 'third_party/pocketfft' 2025-09-07T07:55:12.0882751Z Entering 'third_party/protobuf' 2025-09-07T07:55:12.0927779Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:55:12.0973181Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:55:12.1017011Z Entering 'third_party/psimd' 2025-09-07T07:55:12.1063138Z Entering 'third_party/pthreadpool' 2025-09-07T07:55:12.1110178Z Entering 'third_party/pybind11' 2025-09-07T07:55:12.1155928Z Entering 'third_party/python-peachpy' 2025-09-07T07:55:12.1206474Z Entering 'third_party/sleef' 2025-09-07T07:55:12.1247868Z Entering 'third_party/tensorpipe' 2025-09-07T07:55:12.1287829Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:12.1327776Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:12.1370745Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:12.1415455Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:12.1456600Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:12.1526572Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-09-07T07:55:12.1794304Z Entering 'android/libs/fbjni' 2025-09-07T07:55:12.1840908Z Entering 'third_party/FP16' 2025-09-07T07:55:12.1883266Z Entering 'third_party/FXdiv' 2025-09-07T07:55:12.1925144Z Entering 'third_party/NNPACK' 2025-09-07T07:55:12.1964592Z Entering 'third_party/NVTX' 2025-09-07T07:55:12.2004270Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T07:55:12.2043248Z Entering 'third_party/XNNPACK' 2025-09-07T07:55:12.2096692Z Entering 'third_party/aiter' 2025-09-07T07:55:12.2143188Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T07:55:12.2192738Z Entering 'third_party/benchmark' 2025-09-07T07:55:12.2234816Z Entering 
'third_party/composable_kernel' 2025-09-07T07:55:12.2285251Z Entering 'third_party/cpp-httplib' 2025-09-07T07:55:12.2326559Z Entering 'third_party/cpuinfo' 2025-09-07T07:55:12.2368241Z Entering 'third_party/cudnn_frontend' 2025-09-07T07:55:12.2411277Z Entering 'third_party/cutlass' 2025-09-07T07:55:12.2464380Z Entering 'third_party/fbgemm' 2025-09-07T07:55:12.2508506Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T07:55:12.2548781Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T07:55:12.2594196Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T07:55:12.2633045Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T07:55:12.2678951Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T07:55:12.2716744Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T07:55:12.2755601Z Entering 'third_party/fbgemm/external/json' 2025-09-07T07:55:12.2797541Z Entering 'third_party/flash-attention' 2025-09-07T07:55:12.2838819Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T07:55:12.2883489Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T07:55:12.2937867Z Entering 'third_party/flatbuffers' 2025-09-07T07:55:12.2986344Z Entering 'third_party/fmt' 2025-09-07T07:55:12.3030455Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T07:55:12.3075482Z Entering 'third_party/gloo' 2025-09-07T07:55:12.3119336Z Entering 'third_party/googletest' 2025-09-07T07:55:12.3163611Z Entering 'third_party/ideep' 2025-09-07T07:55:12.3205808Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T07:55:12.3256980Z Entering 'third_party/ittapi' 2025-09-07T07:55:12.3300949Z Entering 'third_party/kineto' 2025-09-07T07:55:12.3343823Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T07:55:12.3386049Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T07:55:12.3430394Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 
2025-09-07T07:55:12.3472293Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T07:55:12.3516173Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T07:55:12.3557666Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T07:55:12.3603187Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T07:55:12.3646184Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T07:55:12.3687602Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T07:55:12.3731821Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T07:55:12.3776311Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T07:55:12.3817747Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T07:55:12.3862886Z Entering 'third_party/kleidiai' 2025-09-07T07:55:12.3907517Z Entering 'third_party/mimalloc' 2025-09-07T07:55:12.3951080Z Entering 'third_party/nlohmann' 2025-09-07T07:55:12.3996440Z Entering 'third_party/onnx' 2025-09-07T07:55:12.4054580Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T07:55:12.4102948Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T07:55:12.4147807Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T07:55:12.4188829Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T07:55:12.4230661Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T07:55:12.4271778Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T07:55:12.4314610Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T07:55:12.4356025Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T07:55:12.4397869Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 
2025-09-07T07:55:12.4437947Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T07:55:12.4482880Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T07:55:12.4527555Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T07:55:12.4589948Z Entering 'third_party/pocketfft' 2025-09-07T07:55:12.4633783Z Entering 'third_party/protobuf' 2025-09-07T07:55:12.4679963Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T07:55:12.4721362Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T07:55:12.4767162Z Entering 'third_party/psimd' 2025-09-07T07:55:12.4811271Z Entering 'third_party/pthreadpool' 2025-09-07T07:55:12.4854115Z Entering 'third_party/pybind11' 2025-09-07T07:55:12.4897222Z Entering 'third_party/python-peachpy' 2025-09-07T07:55:12.4940863Z Entering 'third_party/sleef' 2025-09-07T07:55:12.4984097Z Entering 'third_party/tensorpipe' 2025-09-07T07:55:12.5027911Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T07:55:12.5068923Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T07:55:12.5110397Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T07:55:12.5151525Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T07:55:12.5191832Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T07:55:12.5253460Z ##[endgroup] 2025-09-07T07:55:12.5300316Z [command]/usr/bin/git log -1 --format=%H 2025-09-07T07:55:12.5325892Z 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:55:12.5514391Z Prepare all required actions 2025-09-07T07:55:12.5514914Z Getting action download info 2025-09-07T07:55:12.7207760Z ##[group]Run ./.github/actions/setup-linux 2025-09-07T07:55:12.7208008Z env: 2025-09-07T07:55:12.7208175Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:12.7208362Z ##[endgroup] 2025-09-07T07:55:12.7243705Z ##[group]Run set -euo pipefail 2025-09-07T07:55:12.7244007Z set 
-euo pipefail 2025-09-07T07:55:12.7244233Z function get_ec2_metadata() { 2025-09-07T07:55:12.7244766Z  # Pulled from instance metadata endpoint for EC2 2025-09-07T07:55:12.7245421Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2025-09-07T07:55:12.7245828Z  category=$1 2025-09-07T07:55:12.7246095Z  # If it is GCP runner (runner name contains gcp), do not run this 2025-09-07T07:55:12.7246410Z  runner_name_str=i-0d73070610f53945f-1004 2025-09-07T07:55:12.7246699Z  if [[ -f /.inarc ]]; then 2025-09-07T07:55:12.7246952Z  echo "ARC Runner, no info on ec2 metadata" 2025-09-07T07:55:12.7247227Z  elif [[ $runner_name_str == *"gcp"* ]]; then 2025-09-07T07:55:12.7247564Z  echo "Runner is from Google Cloud Platform, No info on ec2 metadata" 2025-09-07T07:55:12.7247867Z  else 2025-09-07T07:55:12.7248478Z  curl -H "X-aws-ec2-metadata-token: $(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 30")" -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2025-09-07T07:55:12.7249115Z  fi 2025-09-07T07:55:12.7249286Z } 2025-09-07T07:55:12.7249479Z echo "ami-id: $(get_ec2_metadata ami-id)" 2025-09-07T07:55:12.7249783Z echo "instance-id: $(get_ec2_metadata instance-id)" 2025-09-07T07:55:12.7250122Z echo "instance-type: $(get_ec2_metadata instance-type)" 2025-09-07T07:55:12.7250418Z echo "system info $(uname -a)" 2025-09-07T07:55:12.7265388Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:12.7265681Z env: 2025-09-07T07:55:12.7265843Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:12.7266034Z ##[endgroup] 2025-09-07T07:55:12.7305511Z ami-id: ARC Runner, no info on ec2 metadata 2025-09-07T07:55:12.7311535Z instance-id: ARC Runner, no info on ec2 metadata 2025-09-07T07:55:12.7316317Z instance-type: ARC Runner, no info on ec2 metadata 2025-09-07T07:55:12.7327885Z system info Linux 92d046649eb1 6.8.0-1017-aws #18~22.04.1-Ubuntu SMP Thu Oct 3 19:57:42 UTC 2024 x86_64 
x86_64 x86_64 GNU/Linux 2025-09-07T07:55:12.7346592Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T07:55:12.7347301Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T07:55:12.7361984Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:12.7362278Z env: 2025-09-07T07:55:12.7362434Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:12.7362630Z ##[endgroup] 2025-09-07T07:55:12.7434081Z ##[group]Run nick-fields/retry@v3.0.0 2025-09-07T07:55:12.7434304Z with: 2025-09-07T07:55:12.7434445Z shell: bash 2025-09-07T07:55:12.7434608Z timeout_minutes: 5 2025-09-07T07:55:12.7434787Z max_attempts: 3 2025-09-07T07:55:12.7435139Z retry_wait_seconds: 30 2025-09-07T07:55:12.7437081Z command: AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" # For LF Runners we need to make sure we also login to Meta's ECR docker registry too. 
META_AWS_ACCOUNT_ID=308535385114 if [ "$AWS_ACCOUNT_ID" != "$META_AWS_ACCOUNT_ID" ] ; then aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$META_AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" fi 2025-09-07T07:55:12.7438702Z polling_interval_seconds: 1 2025-09-07T07:55:12.7438903Z warning_on_retry: true 2025-09-07T07:55:12.7439093Z continue_on_error: false 2025-09-07T07:55:12.7439275Z env: 2025-09-07T07:55:12.7439426Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:12.7439611Z AWS_RETRY_MODE: standard 2025-09-07T07:55:12.7439795Z AWS_MAX_ATTEMPTS: 5 2025-09-07T07:55:12.7439980Z AWS_DEFAULT_REGION: us-east-1 2025-09-07T07:55:12.7440349Z ##[endgroup] 2025-09-07T07:55:14.3310726Z 2025-09-07T07:55:14.3311601Z WARNING! Your credentials are stored unencrypted in '/home/david/.docker/config.json'. 2025-09-07T07:55:14.3312181Z Configure a credential helper to remove this warning. See 2025-09-07T07:55:14.3312580Z https://docs.docker.com/go/credential-store/ 2025-09-07T07:55:14.3312797Z 2025-09-07T07:55:14.3312878Z Login Succeeded 2025-09-07T07:55:14.8164028Z Command completed after 1 attempt(s). 
2025-09-07T07:55:14.8238214Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T07:55:14.8238633Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T07:55:14.8239028Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T07:55:14.8254315Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:14.8254610Z env: 2025-09-07T07:55:14.8254780Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:14.8255266Z ##[endgroup] 2025-09-07T07:55:14.8346085Z ##[group]Run set +e 2025-09-07T07:55:14.8346332Z set +e 2025-09-07T07:55:14.8346508Z set -x 2025-09-07T07:55:14.8346668Z  2025-09-07T07:55:14.8346842Z PT_DOMAIN=download.pytorch.org 2025-09-07T07:55:14.8347263Z # TODO: Flaky access to download.pytorch.org https://github.com/pytorch/pytorch/issues/100400, 2025-09-07T07:55:14.8347810Z # cleaning this up once the issue is fixed. There are more than one resolved IP here, the last 2025-09-07T07:55:14.8348198Z # one is returned at random 2025-09-07T07:55:14.8348488Z RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" | tail -n1) 2025-09-07T07:55:14.8348760Z  2025-09-07T07:55:14.8348930Z if [ -z "${RESOLVED_IP}" ]; then 2025-09-07T07:55:14.8349242Z  echo "Couldn't resolve ${PT_DOMAIN}, retrying with Google DNS..." 2025-09-07T07:55:14.8349635Z  RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" @8.8.8.8 | tail -n1) 2025-09-07T07:55:14.8349932Z  2025-09-07T07:55:14.8350100Z  if [ -z "${RESOLVED_IP}" ]; then 2025-09-07T07:55:14.8350379Z  echo "Couldn't resolve ${PT_DOMAIN}, exiting..." 
2025-09-07T07:55:14.8350642Z  exit 1 2025-09-07T07:55:14.8350814Z  fi 2025-09-07T07:55:14.8350963Z fi 2025-09-07T07:55:14.8351112Z  2025-09-07T07:55:14.8351299Z if grep -r "${PT_DOMAIN}" /etc/hosts; then 2025-09-07T07:55:14.8351563Z  # Clean up any old records first 2025-09-07T07:55:14.8351818Z  sudo sed -i "/${PT_DOMAIN}/d" /etc/hosts 2025-09-07T07:55:14.8352049Z fi 2025-09-07T07:55:14.8352201Z  2025-09-07T07:55:14.8352425Z echo "${RESOLVED_IP} ${PT_DOMAIN}" | sudo tee -a /etc/hosts 2025-09-07T07:55:14.8352698Z cat /etc/hosts 2025-09-07T07:55:14.8367240Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:14.8367533Z env: 2025-09-07T07:55:14.8367699Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:14.8367900Z ##[endgroup] 2025-09-07T07:55:14.8399525Z + PT_DOMAIN=download.pytorch.org 2025-09-07T07:55:14.8406334Z ++ dig -4 +short download.pytorch.org 2025-09-07T07:55:14.8407960Z ++ tail -n1 2025-09-07T07:55:14.8609763Z + RESOLVED_IP=3.170.131.13 2025-09-07T07:55:14.8610046Z + '[' -z 3.170.131.13 ']' 2025-09-07T07:55:14.8610322Z + grep -r download.pytorch.org /etc/hosts 2025-09-07T07:55:14.8624335Z + echo '3.170.131.13 download.pytorch.org' 2025-09-07T07:55:14.8625896Z + sudo tee -a /etc/hosts 2025-09-07T07:55:14.8687265Z 3.170.131.13 download.pytorch.org 2025-09-07T07:55:14.8695695Z + cat /etc/hosts 2025-09-07T07:55:14.8703748Z 127.0.0.1 localhost 2025-09-07T07:55:14.8708896Z ::1 localhost ip6-localhost ip6-loopback 2025-09-07T07:55:14.8709201Z fe00:: ip6-localnet 2025-09-07T07:55:14.8709415Z ff00:: ip6-mcastprefix 2025-09-07T07:55:14.8709628Z ff02::1 ip6-allnodes 2025-09-07T07:55:14.8709838Z ff02::2 ip6-allrouters 2025-09-07T07:55:14.8710060Z 172.17.0.2 92d046649eb1 2025-09-07T07:55:14.8710283Z 3.170.131.13 download.pytorch.org 2025-09-07T07:55:14.8730299Z ##[group]Run set +x 2025-09-07T07:55:14.8730524Z set +x 2025-09-07T07:55:14.8730691Z  2025-09-07T07:55:14.8730847Z max_attempts=30 2025-09-07T07:55:14.8731041Z delay=10 
2025-09-07T07:55:14.8731210Z attempt=1 2025-09-07T07:55:14.8731380Z  2025-09-07T07:55:14.8731561Z for attempt in $(seq 1 $max_attempts); do 2025-09-07T07:55:14.8731954Z  echo "Attempt $attempt of $max_attempts: Checking if Docker daemon is running..." 2025-09-07T07:55:14.8732337Z  if docker info > /dev/null 2>&1; then 2025-09-07T07:55:14.8732652Z  echo "Docker is running. Proceeding with the next steps" 2025-09-07T07:55:14.8732932Z  exit 0 2025-09-07T07:55:14.8733096Z  else 2025-09-07T07:55:14.8733285Z  echo "Docker is not running yet." 2025-09-07T07:55:14.8733544Z  echo "Retrying in $delay seconds..." 2025-09-07T07:55:14.8733789Z  sleep $delay 2025-09-07T07:55:14.8733967Z  fi 2025-09-07T07:55:14.8734120Z done 2025-09-07T07:55:14.8734366Z echo "Reached maximum attempts to connect to Docker. Exiting." 2025-09-07T07:55:14.8734657Z exit 1 2025-09-07T07:55:14.8748943Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:14.8749219Z env: 2025-09-07T07:55:14.8749380Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:14.8749570Z ##[endgroup] 2025-09-07T07:55:14.8793178Z Attempt 1 of 30: Checking if Docker daemon is running... 2025-09-07T07:55:14.9251786Z Docker is running. Proceeding with the next steps 2025-09-07T07:55:14.9400385Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-09-07T07:55:14.9400730Z with: 2025-09-07T07:55:14.9401371Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:14.9402083Z use-custom-docker-registry: true 2025-09-07T07:55:14.9402307Z docker-build-dir: .ci/docker 2025-09-07T07:55:14.9402517Z docker-build-script: ./build.sh 2025-09-07T07:55:14.9402722Z working-directory: . 
2025-09-07T07:55:14.9402969Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:14.9403249Z force-push: false 2025-09-07T07:55:14.9403410Z env: 2025-09-07T07:55:14.9403556Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:14.9403744Z ##[endgroup] 2025-09-07T07:55:14.9420165Z ##[group]Run set -ex 2025-09-07T07:55:14.9420395Z set -ex 2025-09-07T07:55:14.9420576Z  2025-09-07T07:55:14.9420898Z # If the docker build directory or the build script doesn't exist, the action will 2025-09-07T07:55:14.9421475Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-09-07T07:55:14.9421893Z # job could then download the pre-built image as usual 2025-09-07T07:55:14.9422378Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-09-07T07:55:14.9422846Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9423090Z else 2025-09-07T07:55:14.9423282Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9423592Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9423876Z  2025-09-07T07:55:14.9424280Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 
2025-09-07T07:55:14.9424728Z  exit 0 2025-09-07T07:55:14.9424883Z fi 2025-09-07T07:55:14.9425198Z  2025-09-07T07:55:14.9425451Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-09-07T07:55:14.9425876Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-09-07T07:55:14.9426547Z  # use it as it is, but first let's extract the tag 2025-09-07T07:55:14.9426884Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-09-07T07:55:14.9427244Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9427587Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9427869Z else 2025-09-07T07:55:14.9428062Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-09-07T07:55:14.9428327Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-09-07T07:55:14.9428608Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-09-07T07:55:14.9428846Z  fi 2025-09-07T07:55:14.9429177Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-09-07T07:55:14.9429600Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9430073Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9430559Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9430860Z fi 2025-09-07T07:55:14.9443835Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:14.9444120Z env: 2025-09-07T07:55:14.9444279Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:14.9444474Z REPO_NAME: pytorch 2025-09-07T07:55:14.9445533Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:14.9446234Z DOCKER_BUILD_DIR: .ci/docker 2025-09-07T07:55:14.9446451Z DOCKER_BUILD_SCRIPT: 
./build.sh 2025-09-07T07:55:14.9446726Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:14.9447014Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-09-07T07:55:14.9447231Z CUSTOM_TAG_PREFIX: 2025-09-07T07:55:14.9447401Z ##[endgroup] 2025-09-07T07:55:14.9477356Z + [[ -d .ci/docker ]] 2025-09-07T07:55:14.9477605Z + [[ -f .ci/docker/./build.sh ]] 2025-09-07T07:55:14.9477855Z + [[ true == \t\r\u\e ]] 2025-09-07T07:55:14.9478072Z + echo skip=false 2025-09-07T07:55:14.9479127Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-09-07T07:55:14.9485551Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:14.9486445Z ++ awk -F '[:,]' '{print $2}' 2025-09-07T07:55:14.9499637Z + DOCKER_TAG=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:14.9500553Z + echo docker-tag=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:14.9501773Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:14.9521672Z ##[group]Run set +e 2025-09-07T07:55:14.9521903Z set +e 2025-09-07T07:55:14.9522064Z set -x 2025-09-07T07:55:14.9522217Z  2025-09-07T07:55:14.9522368Z login() { 2025-09-07T07:55:14.9522732Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-09-07T07:55:14.9523104Z } 2025-09-07T07:55:14.9523247Z  2025-09-07T07:55:14.9523391Z retry () { 
2025-09-07T07:55:14.9523582Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-09-07T07:55:14.9523802Z } 2025-09-07T07:55:14.9524119Z  2025-09-07T07:55:14.9524281Z retry login "${DOCKER_REGISTRY}" 2025-09-07T07:55:14.9524495Z  2025-09-07T07:55:14.9524646Z START_TIME=$(date +%s) 2025-09-07T07:55:14.9524851Z # Wait up to 120 minutes 2025-09-07T07:55:14.9525294Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-09-07T07:55:14.9525658Z  # Check if image already exists, if it does then skip building it 2025-09-07T07:55:14.9526009Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-09-07T07:55:14.9526259Z  exit 0 2025-09-07T07:55:14.9526423Z  fi 2025-09-07T07:55:14.9526576Z  2025-09-07T07:55:14.9526845Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-09-07T07:55:14.9527399Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-09-07T07:55:14.9527858Z  # latter, it will wait for the Docker images to become available before continuing 2025-09-07T07:55:14.9528231Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-09-07T07:55:14.9528514Z  # It's a Docker build job, let's build the image 2025-09-07T07:55:14.9528754Z  break 2025-09-07T07:55:14.9528911Z  else 2025-09-07T07:55:14.9529150Z  # It's a regular build job, wait for the image to become available 2025-09-07T07:55:14.9529456Z  sleep 300 2025-09-07T07:55:14.9529637Z  fi 2025-09-07T07:55:14.9529783Z done 2025-09-07T07:55:14.9529929Z  2025-09-07T07:55:14.9530359Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-09-07T07:55:14.9530765Z # be empty. 
The default action would be to continue rebuild the image 2025-09-07T07:55:14.9531133Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-09-07T07:55:14.9531450Z  # if we're on the base branch then use the parent commit 2025-09-07T07:55:14.9531740Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-09-07T07:55:14.9531962Z else 2025-09-07T07:55:14.9532189Z  # otherwise we're on a PR, so use the most recent base commit 2025-09-07T07:55:14.9532515Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-09-07T07:55:14.9532767Z fi 2025-09-07T07:55:14.9532914Z  2025-09-07T07:55:14.9533092Z if [[ -z "${MERGE_BASE}" ]]; then 2025-09-07T07:55:14.9533340Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-09-07T07:55:14.9533577Z  2025-09-07T07:55:14.9533910Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-09-07T07:55:14.9534292Z  exit 0 2025-09-07T07:55:14.9534451Z fi 2025-09-07T07:55:14.9534592Z  2025-09-07T07:55:14.9534806Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-09-07T07:55:14.9535437Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-09-07T07:55:14.9535842Z  exit 1 2025-09-07T07:55:14.9535992Z fi 2025-09-07T07:55:14.9536134Z  2025-09-07T07:55:14.9536384Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-09-07T07:55:14.9536840Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-09-07T07:55:14.9537245Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-09-07T07:55:14.9537713Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-09-07T07:55:14.9538245Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-09-07T07:55:14.9538559Z fi 2025-09-07T07:55:14.9538706Z  2025-09-07T07:55:14.9538879Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 
2025-09-07T07:55:14.9552277Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:14.9552554Z env: 2025-09-07T07:55:14.9552711Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:14.9552912Z DOCKER_BUILD_DIR: .ci/docker 2025-09-07T07:55:14.9553161Z BASE_REVISION: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T07:55:14.9553899Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:14.9554858Z DOCKER_TAG: pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:14.9555586Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:14.9555862Z DOCKER_PUSH: 2025-09-07T07:55:14.9556023Z ##[endgroup] 2025-09-07T07:55:14.9585362Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:14.9585705Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:14.9588031Z + aws ecr get-login-password --region us-east-1 2025-09-07T07:55:14.9592228Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:15.7473940Z 2025-09-07T07:55:15.7474260Z WARNING! Your credentials are stored unencrypted in '/home/david/.docker/config.json'. 2025-09-07T07:55:15.7474794Z Configure a credential helper to remove this warning. 
See 2025-09-07T07:55:15.7475365Z https://docs.docker.com/go/credential-store/ 2025-09-07T07:55:15.7475582Z 2025-09-07T07:55:15.7475675Z Login Succeeded 2025-09-07T07:55:15.7497609Z ++ date +%s 2025-09-07T07:55:15.7509009Z + START_TIME=1757231715 2025-09-07T07:55:15.7512213Z ++ date +%s 2025-09-07T07:55:15.7521169Z + [[ 1757224515 -lt 1757231715 ]] 2025-09-07T07:55:15.7522095Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:16.1493478Z { 2025-09-07T07:55:16.1493796Z "schemaVersion": 2, 2025-09-07T07:55:16.1494239Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-09-07T07:55:16.1494691Z "config": { 2025-09-07T07:55:16.1495493Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-09-07T07:55:16.1495882Z "size": 31375, 2025-09-07T07:55:16.1496254Z "digest": "sha256:29d1d8a31b215537637bab7c99e18c255840b899cf7023e4e3cb5efa3270aef8" 2025-09-07T07:55:16.1496660Z }, 2025-09-07T07:55:16.1496833Z "layers": [ 2025-09-07T07:55:16.1497029Z { 2025-09-07T07:55:16.1497352Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1497716Z "size": 30448359, 2025-09-07T07:55:16.1498089Z "digest": "sha256:e6fdc8487bfe6d764301ef3634bc6c043841dc3ab05ca14f81e69c0f92562d46" 2025-09-07T07:55:16.1498504Z }, 2025-09-07T07:55:16.1498672Z { 2025-09-07T07:55:16.1498950Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1499323Z "size": 1554, 2025-09-07T07:55:16.1499690Z "digest": "sha256:171dcef20c49de4bc9268f60e02f111b72c638b0f24c3c5636c5013029db6d30" 2025-09-07T07:55:16.1500194Z }, 2025-09-07T07:55:16.1500361Z { 2025-09-07T07:55:16.1500647Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1501006Z "size": 313297922, 2025-09-07T07:55:16.1501485Z "digest": 
"sha256:4c92b3f72f1df31fe9f487fc1c27fcf1ba475ffb43abd69056306d1247786e40" 2025-09-07T07:55:16.1501895Z }, 2025-09-07T07:55:16.1502057Z { 2025-09-07T07:55:16.1502338Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1502691Z "size": 792, 2025-09-07T07:55:16.1503030Z "digest": "sha256:744f9ba90a6582eb601b3c20409bb10d6dad635dd118c3975f79721f4c82747c" 2025-09-07T07:55:16.1503432Z }, 2025-09-07T07:55:16.1503597Z { 2025-09-07T07:55:16.1503865Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1504673Z "size": 106, 2025-09-07T07:55:16.1505220Z "digest": "sha256:d3c08322a3326e45849dd80264a047c4f42ba4a2419d35c919542e2890e23934" 2025-09-07T07:55:16.1505620Z }, 2025-09-07T07:55:16.1505780Z { 2025-09-07T07:55:16.1506154Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1506532Z "size": 704, 2025-09-07T07:55:16.1506884Z "digest": "sha256:ffd43b71f3ccf3ba563606231cb1d191eb9dd0052f422d54835e6af350525170" 2025-09-07T07:55:16.1507294Z }, 2025-09-07T07:55:16.1507451Z { 2025-09-07T07:55:16.1507721Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1508081Z "size": 1215, 2025-09-07T07:55:16.1508439Z "digest": "sha256:830692b57f6e2758398ec80c3b67a20441d12696b54ed14f2ecebf926198f7d6" 2025-09-07T07:55:16.1508837Z }, 2025-09-07T07:55:16.1508994Z { 2025-09-07T07:55:16.1509264Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1509615Z "size": 482, 2025-09-07T07:55:16.1509952Z "digest": "sha256:5bad36d184686719399be50830a98939d7dbda2313fb407df5915217483fc6a3" 2025-09-07T07:55:16.1510346Z }, 2025-09-07T07:55:16.1510509Z { 2025-09-07T07:55:16.1510779Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1511135Z "size": 110343614, 2025-09-07T07:55:16.1511429Z "digest": "sha256:0e34fdd9ac5c39eb0a9d2c2d258b26f42bb79d7dc0a22014bf201daa2e033eb4" 2025-09-07T07:55:16.1511759Z }, 
2025-09-07T07:55:16.1511894Z { 2025-09-07T07:55:16.1512111Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1512392Z "size": 4786, 2025-09-07T07:55:16.1512938Z "digest": "sha256:3c868a62868ef54f82ac11be8dabe1b4365d000bacfe4c104e08022fc96dd767" 2025-09-07T07:55:16.1513285Z }, 2025-09-07T07:55:16.1513422Z { 2025-09-07T07:55:16.1513633Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1513915Z "size": 1710, 2025-09-07T07:55:16.1514192Z "digest": "sha256:62170a22dd571d55ffccac64c0be17f4006d2498cfbf7c6289325f0899cba005" 2025-09-07T07:55:16.1514520Z }, 2025-09-07T07:55:16.1514648Z { 2025-09-07T07:55:16.1514868Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1515359Z "size": 724, 2025-09-07T07:55:16.1515637Z "digest": "sha256:553c1d23b6c4dbd8ab136d0c3659460391ffa14cb9b43be9d7b2f47f90895697" 2025-09-07T07:55:16.1515969Z }, 2025-09-07T07:55:16.1516102Z { 2025-09-07T07:55:16.1516320Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1516602Z "size": 543, 2025-09-07T07:55:16.1516878Z "digest": "sha256:9408d557a804a7dce00897e03ce9f4f447281eb38ce4bc331098a1f1a5ff0d30" 2025-09-07T07:55:16.1517199Z }, 2025-09-07T07:55:16.1517331Z { 2025-09-07T07:55:16.1517549Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1517825Z "size": 3241148049, 2025-09-07T07:55:16.1518122Z "digest": "sha256:df607cfc7c07db6d442e0274e2be8cdc507df8716717363aa92f2fea069bdd9a" 2025-09-07T07:55:16.1518452Z }, 2025-09-07T07:55:16.1518587Z { 2025-09-07T07:55:16.1518800Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1519086Z "size": 32, 2025-09-07T07:55:16.1519368Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:55:16.1519693Z }, 2025-09-07T07:55:16.1519821Z { 2025-09-07T07:55:16.1520048Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1520328Z "size": 380, 2025-09-07T07:55:16.1520618Z "digest": "sha256:40a8e39faeda9f5273ff5014b2ef7d1ffeeef321de234186a705b1e0574326d2" 2025-09-07T07:55:16.1520939Z }, 2025-09-07T07:55:16.1521079Z { 2025-09-07T07:55:16.1521303Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1521590Z "size": 53548049, 2025-09-07T07:55:16.1521875Z "digest": "sha256:d895771c9faca390d7270f8c9c832b1428128c31ba6760b837d64b7e5920373f" 2025-09-07T07:55:16.1522435Z }, 2025-09-07T07:55:16.1522574Z { 2025-09-07T07:55:16.1522794Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1523073Z "size": 232, 2025-09-07T07:55:16.1523354Z "digest": "sha256:c4ee04f39d49efb46e52443e60c7f41832ea708d9bc5bf76c6d740895c66f57a" 2025-09-07T07:55:16.1523681Z }, 2025-09-07T07:55:16.1523816Z { 2025-09-07T07:55:16.1524034Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1524319Z "size": 3403403, 2025-09-07T07:55:16.1524601Z "digest": "sha256:3690c9826e48ed74e21e494d9d78990902abbc68795d002260ce71bff9a2cb3b" 2025-09-07T07:55:16.1524919Z }, 2025-09-07T07:55:16.1525214Z { 2025-09-07T07:55:16.1525434Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1525726Z "size": 1478, 2025-09-07T07:55:16.1526009Z "digest": "sha256:57cbc5013733eedfdf176b6db4b44458e826e1f64c0ef38849e9d77addc88936" 2025-09-07T07:55:16.1526323Z }, 2025-09-07T07:55:16.1526461Z { 2025-09-07T07:55:16.1526680Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1526957Z "size": 482, 2025-09-07T07:55:16.1527227Z "digest": "sha256:f5f4b06b58bbe4201d8b2eb5b0c6c1299f2725dd59e71cc45ef76ad89bba4deb" 2025-09-07T07:55:16.1527549Z }, 2025-09-07T07:55:16.1527680Z { 2025-09-07T07:55:16.1527899Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1528174Z "size": 197, 
2025-09-07T07:55:16.1528454Z "digest": "sha256:f59713ce4bf491fe1f663d90e3b32d2290a7d8a4a0e8e13301e3bdb10b949f8e" 2025-09-07T07:55:16.1528776Z }, 2025-09-07T07:55:16.1528909Z { 2025-09-07T07:55:16.1529300Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1529585Z "size": 608, 2025-09-07T07:55:16.1529863Z "digest": "sha256:fe0486521517e626cae4fcbd9c83eb3956aad3ab0f833becee187b830891417b" 2025-09-07T07:55:16.1530202Z }, 2025-09-07T07:55:16.1530337Z { 2025-09-07T07:55:16.1530558Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1530848Z "size": 7874747615, 2025-09-07T07:55:16.1531140Z "digest": "sha256:8c21cc3715a2d715295f0299d8d2443262a3ae8defc1921f3226a0a24fc9c8fe" 2025-09-07T07:55:16.1531455Z }, 2025-09-07T07:55:16.1531590Z { 2025-09-07T07:55:16.1531807Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1532087Z "size": 829, 2025-09-07T07:55:16.1532359Z "digest": "sha256:d37c58456a6a4aa45d78abdb95553b3de0c79d941e18dc757c2c39fd59819739" 2025-09-07T07:55:16.1532681Z }, 2025-09-07T07:55:16.1532815Z { 2025-09-07T07:55:16.1533038Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1533320Z "size": 36688200, 2025-09-07T07:55:16.1533880Z "digest": "sha256:d042f63abc13891184a9d8e0dcdfae9a0daa140dea919fd319f12dcab5c684eb" 2025-09-07T07:55:16.1534225Z }, 2025-09-07T07:55:16.1534363Z { 2025-09-07T07:55:16.1534580Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1534874Z "size": 104, 2025-09-07T07:55:16.1535356Z "digest": "sha256:621284a9c05a47131a59226f6847b5b76ad211908278c1bdb990029d42259941" 2025-09-07T07:55:16.1535672Z }, 2025-09-07T07:55:16.1535806Z { 2025-09-07T07:55:16.1536023Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1536307Z "size": 1496, 2025-09-07T07:55:16.1536592Z "digest": 
"sha256:85f605d2dd3a8378567d3d974f0ec4694ef5fd988b25aca5d9aebd7c9b9ff018" 2025-09-07T07:55:16.1536912Z }, 2025-09-07T07:55:16.1537037Z { 2025-09-07T07:55:16.1537258Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1537542Z "size": 454406172, 2025-09-07T07:55:16.1537843Z "digest": "sha256:381b5539e5981dc994e71ab212f50135c32128fe1cc35d78bc386da6dffe1d51" 2025-09-07T07:55:16.1538163Z }, 2025-09-07T07:55:16.1538293Z { 2025-09-07T07:55:16.1538514Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1538972Z "size": 162, 2025-09-07T07:55:16.1539242Z "digest": "sha256:a487c0c800295407a4c7ab88c5b9e891b8b6aab9e35e62994d124369fcd7ba87" 2025-09-07T07:55:16.1539560Z }, 2025-09-07T07:55:16.1539696Z { 2025-09-07T07:55:16.1539913Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1540189Z "size": 346, 2025-09-07T07:55:16.1540472Z "digest": "sha256:48bcb81e256634f4132369d8bac738d9d622b010e5802e5292f565edba9035df" 2025-09-07T07:55:16.1540791Z }, 2025-09-07T07:55:16.1540921Z { 2025-09-07T07:55:16.1541133Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1541532Z "size": 32, 2025-09-07T07:55:16.1541819Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:55:16.1542142Z }, 2025-09-07T07:55:16.1542269Z { 2025-09-07T07:55:16.1542494Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1542783Z "size": 106, 2025-09-07T07:55:16.1543071Z "digest": "sha256:e261928c0043c734790a38fa9ebf1bf8674801fa2f5051c3d2eac04e0f02b743" 2025-09-07T07:55:16.1543388Z }, 2025-09-07T07:55:16.1543523Z { 2025-09-07T07:55:16.1543745Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1544028Z "size": 425, 2025-09-07T07:55:16.1544301Z "digest": "sha256:0fea55428091bc98d5c48986120dd1da50b9b6cbd507408b2cdebdbe455e272e" 2025-09-07T07:55:16.1544625Z }, 
2025-09-07T07:55:16.1544758Z { 2025-09-07T07:55:16.1545111Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1545407Z "size": 20224775, 2025-09-07T07:55:16.1545864Z "digest": "sha256:b4291bccbb8428a38187cd286fef7c24bd4863c7872c4d1cf96404ec1a69b321" 2025-09-07T07:55:16.1546198Z }, 2025-09-07T07:55:16.1546332Z { 2025-09-07T07:55:16.1546548Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1546833Z "size": 108, 2025-09-07T07:55:16.1547120Z "digest": "sha256:ddc91b09189afc218499daee92ebc22c6deefb22ee115c52c07627ecbaf7b9d5" 2025-09-07T07:55:16.1547455Z }, 2025-09-07T07:55:16.1547583Z { 2025-09-07T07:55:16.1547802Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1548082Z "size": 640, 2025-09-07T07:55:16.1548356Z "digest": "sha256:7540c74286279d1d6a29cdb51d3421e64860c6af74ca4a95736725c0509791ed" 2025-09-07T07:55:16.1548662Z }, 2025-09-07T07:55:16.1548811Z { 2025-09-07T07:55:16.1549034Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1549320Z "size": 724, 2025-09-07T07:55:16.1549597Z "digest": "sha256:553c1d23b6c4dbd8ab136d0c3659460391ffa14cb9b43be9d7b2f47f90895697" 2025-09-07T07:55:16.1549917Z }, 2025-09-07T07:55:16.1550047Z { 2025-09-07T07:55:16.1550273Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1550553Z "size": 149, 2025-09-07T07:55:16.1550826Z "digest": "sha256:003c4e2598fb39f97ec7734271e034a48a3956a58429c9d06601770c2c40de11" 2025-09-07T07:55:16.1551144Z }, 2025-09-07T07:55:16.1551274Z { 2025-09-07T07:55:16.1551485Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1551767Z "size": 135, 2025-09-07T07:55:16.1552043Z "digest": "sha256:5687149362ae68fa2aa7d4ecd39fbf7ea86c0f6ced36a71f3c59f68f6c465cfc" 2025-09-07T07:55:16.1552367Z }, 2025-09-07T07:55:16.1552493Z { 2025-09-07T07:55:16.1552711Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1552995Z "size": 141, 2025-09-07T07:55:16.1553278Z "digest": "sha256:cdd2cf54eb2a3d8d034aa1556c9724d240b06397ba08f8b13b0bed6d65755aeb" 2025-09-07T07:55:16.1553597Z }, 2025-09-07T07:55:16.1553732Z { 2025-09-07T07:55:16.1553953Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1554241Z "size": 18615922074, 2025-09-07T07:55:16.1554538Z "digest": "sha256:d3ad4df1ba3a86ef1f84c427aae440ff027d483949d48eec4be6135260668cad" 2025-09-07T07:55:16.1555174Z }, 2025-09-07T07:55:16.1555308Z { 2025-09-07T07:55:16.1555528Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1555805Z "size": 223, 2025-09-07T07:55:16.1556080Z "digest": "sha256:3c9055753b4c79d74c707a91d8626ce10bc439129ba10dad3ebc643d9d4955dd" 2025-09-07T07:55:16.1556401Z }, 2025-09-07T07:55:16.1556535Z { 2025-09-07T07:55:16.1556750Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1557034Z "size": 353035275, 2025-09-07T07:55:16.1557330Z "digest": "sha256:31cf8d0bd21c76ae21f73d8b19b30949d161a498354f54191b4e5a294e929701" 2025-09-07T07:55:16.1557654Z }, 2025-09-07T07:55:16.1557795Z { 2025-09-07T07:55:16.1558020Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1558305Z "size": 6523020957, 2025-09-07T07:55:16.1558594Z "digest": "sha256:6623ea81497183b62e034e4ea8df8bf00fa75aaa192eea2821b2dd8655383b8f" 2025-09-07T07:55:16.1558911Z }, 2025-09-07T07:55:16.1559050Z { 2025-09-07T07:55:16.1559278Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1559572Z "size": 129, 2025-09-07T07:55:16.1559849Z "digest": "sha256:11696c3aa3808236d49256bc170b49d55cf657e499592b39b4856f6137220f55" 2025-09-07T07:55:16.1560177Z }, 2025-09-07T07:55:16.1560313Z { 2025-09-07T07:55:16.1560538Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1560819Z "size": 778, 
2025-09-07T07:55:16.1561108Z "digest": "sha256:ef4d544e35cacc73a229bcbc7a5510f8b156c7b3041f19f3a274562cd97cfd94" 2025-09-07T07:55:16.1561434Z }, 2025-09-07T07:55:16.1561567Z { 2025-09-07T07:55:16.1561974Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1562269Z "size": 724, 2025-09-07T07:55:16.1562549Z "digest": "sha256:553c1d23b6c4dbd8ab136d0c3659460391ffa14cb9b43be9d7b2f47f90895697" 2025-09-07T07:55:16.1562871Z }, 2025-09-07T07:55:16.1562998Z { 2025-09-07T07:55:16.1563218Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1563520Z "size": 141, 2025-09-07T07:55:16.1563791Z "digest": "sha256:5c5108865e5e293209ae9bae8a29645035242e7e4b4433208a777496fddc988c" 2025-09-07T07:55:16.1564102Z }, 2025-09-07T07:55:16.1564232Z { 2025-09-07T07:55:16.1564450Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1564730Z "size": 32, 2025-09-07T07:55:16.1565144Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:55:16.1565470Z }, 2025-09-07T07:55:16.1565617Z { 2025-09-07T07:55:16.1565848Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1566125Z "size": 159, 2025-09-07T07:55:16.1566396Z "digest": "sha256:9e97578e9edf1a11187740a5aa102633331fb6a714d0ed48683782de5a36fbd8" 2025-09-07T07:55:16.1566714Z }, 2025-09-07T07:55:16.1566853Z { 2025-09-07T07:55:16.1567068Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1567356Z "size": 1012, 2025-09-07T07:55:16.1567642Z "digest": "sha256:da5a91b54cb51f851560992645bc203f2287d9b1d7a4f04f7f4ea7efe45036ce" 2025-09-07T07:55:16.1567965Z }, 2025-09-07T07:55:16.1568093Z { 2025-09-07T07:55:16.1568314Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1568600Z "size": 724, 2025-09-07T07:55:16.1568883Z "digest": "sha256:553c1d23b6c4dbd8ab136d0c3659460391ffa14cb9b43be9d7b2f47f90895697" 
2025-09-07T07:55:16.1569201Z }, 2025-09-07T07:55:16.1569336Z { 2025-09-07T07:55:16.1569564Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1569864Z "size": 135, 2025-09-07T07:55:16.1570143Z "digest": "sha256:1e93be219e89e7733b91ba7e3af1a44d985e84959f732ecd5f5ca61bd13b5d41" 2025-09-07T07:55:16.1570477Z }, 2025-09-07T07:55:16.1570611Z { 2025-09-07T07:55:16.1570836Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1571113Z "size": 32, 2025-09-07T07:55:16.1571549Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:55:16.1571882Z }, 2025-09-07T07:55:16.1572032Z { 2025-09-07T07:55:16.1572250Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1572537Z "size": 158, 2025-09-07T07:55:16.1572810Z "digest": "sha256:136825afebb533ee295f0d2523595281086c6410c60d5f712b84cefd24cb31d5" 2025-09-07T07:55:16.1573129Z }, 2025-09-07T07:55:16.1573257Z { 2025-09-07T07:55:16.1573477Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1573759Z "size": 1368, 2025-09-07T07:55:16.1574037Z "digest": "sha256:22b39805302d877e4c1ba433ebc36520438ea29a9ba8bc059efbcd9106f3a82d" 2025-09-07T07:55:16.1574348Z }, 2025-09-07T07:55:16.1574479Z { 2025-09-07T07:55:16.1574697Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1575110Z "size": 32, 2025-09-07T07:55:16.1575385Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:55:16.1575712Z }, 2025-09-07T07:55:16.1575844Z { 2025-09-07T07:55:16.1576060Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1576335Z "size": 136, 2025-09-07T07:55:16.1576623Z "digest": "sha256:d12add675e3505e74eb9880eeef540ea0801282ca1ae01c3c221157cec91f5ae" 2025-09-07T07:55:16.1576947Z }, 2025-09-07T07:55:16.1577079Z { 2025-09-07T07:55:16.1577290Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1577572Z "size": 380, 2025-09-07T07:55:16.1577995Z "digest": "sha256:bc127046d33a7a98563698411b54ece8a167d520922879d7b69e8ca73a12d034" 2025-09-07T07:55:16.1578319Z }, 2025-09-07T07:55:16.1578447Z { 2025-09-07T07:55:16.1578665Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1578946Z "size": 32, 2025-09-07T07:55:16.1579223Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:55:16.1579551Z }, 2025-09-07T07:55:16.1579678Z { 2025-09-07T07:55:16.1579898Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1580185Z "size": 104, 2025-09-07T07:55:16.1580456Z "digest": "sha256:951e8ce838415c4257680a9d60d216f3750cbb18d243d9a21e2008cce7e589cf" 2025-09-07T07:55:16.1580767Z }, 2025-09-07T07:55:16.1580902Z { 2025-09-07T07:55:16.1581125Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1581498Z "size": 408, 2025-09-07T07:55:16.1581778Z "digest": "sha256:32340b97ae50ba7b2918ab40d6f4a8db875afee69318f484e4deb0a1e2ec4beb" 2025-09-07T07:55:16.1582103Z }, 2025-09-07T07:55:16.1582240Z { 2025-09-07T07:55:16.1582460Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1582738Z "size": 32, 2025-09-07T07:55:16.1583013Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:55:16.1583341Z }, 2025-09-07T07:55:16.1583476Z { 2025-09-07T07:55:16.1583693Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1583981Z "size": 109, 2025-09-07T07:55:16.1584263Z "digest": "sha256:5bbb04cd6b57ae13d7cf05ab9e9b4ed9752833ee2dba4eeaac47bde6022c4725" 2025-09-07T07:55:16.1584588Z }, 2025-09-07T07:55:16.1584714Z { 2025-09-07T07:55:16.1584930Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1585351Z "size": 1897, 
2025-09-07T07:55:16.1585635Z "digest": "sha256:d8c4b845cfc7ca7cc0604f472bf6da8b1f1d4e98dff3c76e1985a7013a5b9e3f" 2025-09-07T07:55:16.1585956Z }, 2025-09-07T07:55:16.1586091Z { 2025-09-07T07:55:16.1586308Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1586590Z "size": 243440375, 2025-09-07T07:55:16.1586873Z "digest": "sha256:b35c180f4d8ddc2396eac4a6b893f438481a8163ceb0b88f203488bc5f2a8ba4" 2025-09-07T07:55:16.1587207Z }, 2025-09-07T07:55:16.1587490Z { 2025-09-07T07:55:16.1587709Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1587982Z "size": 106, 2025-09-07T07:55:16.1588258Z "digest": "sha256:5f967b3c303a99e609441551f7c8988cca4fd464c0c3127506bff8509583091b" 2025-09-07T07:55:16.1588574Z }, 2025-09-07T07:55:16.1588705Z { 2025-09-07T07:55:16.1588916Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1589196Z "size": 166, 2025-09-07T07:55:16.1589470Z "digest": "sha256:04770904f012e5584f1c19a0bc92d9863baaebf08bf75b4a9981f2b7795c8953" 2025-09-07T07:55:16.1589790Z }, 2025-09-07T07:55:16.1589920Z { 2025-09-07T07:55:16.1590137Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1590428Z "size": 7943, 2025-09-07T07:55:16.1590714Z "digest": "sha256:73373941fb321b4cb4a171b1423a68a4c7fedada3a1498868d7efe93cb03170e" 2025-09-07T07:55:16.1591029Z }, 2025-09-07T07:55:16.1591164Z { 2025-09-07T07:55:16.1591384Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1591690Z "size": 8072, 2025-09-07T07:55:16.1591964Z "digest": "sha256:9572e6cd907bfa4888456dbccc6e22146a0044374585f3fa0a8ced19b831ed62" 2025-09-07T07:55:16.1592284Z }, 2025-09-07T07:55:16.1592420Z { 2025-09-07T07:55:16.1592639Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1592922Z "size": 304, 2025-09-07T07:55:16.1593204Z "digest": "sha256:64a544aba233551e38898f138dd6ba3161ccdb9554e0ffb5b9d8f0f7fe4a7fa8" 
2025-09-07T07:55:16.1593527Z }, 2025-09-07T07:55:16.1593662Z { 2025-09-07T07:55:16.1594015Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1594307Z "size": 13362696, 2025-09-07T07:55:16.1594589Z "digest": "sha256:7e35418a24997de5428763c93826679486760a1a9563209ae64de66ba45f99c1" 2025-09-07T07:55:16.1594903Z }, 2025-09-07T07:55:16.1595181Z { 2025-09-07T07:55:16.1595404Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1595695Z "size": 108, 2025-09-07T07:55:16.1595976Z "digest": "sha256:2ed8e82748d4a1131f41d9e41322f47a6ffef67a5a2b7bf5392237db5c035c61" 2025-09-07T07:55:16.1596292Z }, 2025-09-07T07:55:16.1596424Z { 2025-09-07T07:55:16.1596641Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1596919Z "size": 54145663, 2025-09-07T07:55:16.1597199Z "digest": "sha256:c988fbcccd708fb158a81c429d32e1060a7e40924fc3c987c629fa69d9484717" 2025-09-07T07:55:16.1597519Z }, 2025-09-07T07:55:16.1597651Z { 2025-09-07T07:55:16.1597874Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-09-07T07:55:16.1598150Z "size": 32, 2025-09-07T07:55:16.1598429Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-09-07T07:55:16.1598750Z } 2025-09-07T07:55:16.1598883Z ] 2025-09-07T07:55:16.1599011Z } 2025-09-07T07:55:16.1599171Z + exit 0 2025-09-07T07:55:16.1637527Z ##[group]Run set -eux 2025-09-07T07:55:16.1637743Z set -eux 2025-09-07T07:55:16.1638045Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-09-07T07:55:16.1638873Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-09-07T07:55:16.1654631Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:16.1654923Z env: 2025-09-07T07:55:16.1655238Z 
GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:16.1655438Z ##[endgroup] 2025-09-07T07:55:16.1691120Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-09-07T07:55:16.1691888Z + jq --raw-output .SecretString 2025-09-07T07:55:16.1694653Z + jq -r .docker_hub_readonly_token 2025-09-07T07:55:16.1697064Z + docker login --username pytorchbot --password-stdin 2025-09-07T07:55:16.7457242Z 2025-09-07T07:55:16.7460481Z An error occurred (AccessDeniedException) when calling the GetSecretValue operation: User: arn:aws:sts::308535385114:assumed-role/gh-ci-github-action-runners-runner-role/i-0d73070610f53945f is not authorized to perform: secretsmanager:GetSecretValue on resource: docker_hub_readonly_token because no identity-based policy allows the secretsmanager:GetSecretValue action 2025-09-07T07:55:16.8201159Z Error: Cannot perform an interactive login from a non TTY device 2025-09-07T07:55:16.8217937Z + true 2025-09-07T07:55:16.8280529Z ##[group]Run tag=${ECR_DOCKER_IMAGE##*:} 2025-09-07T07:55:16.8280846Z tag=${ECR_DOCKER_IMAGE##*:} 2025-09-07T07:55:16.8281148Z echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}" 2025-09-07T07:55:16.8295900Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:16.8296195Z env: 2025-09-07T07:55:16.8296352Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:16.8297032Z ECR_DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:16.8297748Z ##[endgroup] 2025-09-07T07:55:16.8331824Z docker pull ghcr.io/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:16.8367220Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-09-07T07:55:16.8367580Z with: 2025-09-07T07:55:16.8368249Z docker-image: 
308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:16.8369009Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:16.8369313Z env: 2025-09-07T07:55:16.8369490Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:16.8369703Z ##[endgroup] 2025-09-07T07:55:16.8381852Z ##[group]Run set -x 2025-09-07T07:55:16.8382060Z set -x 2025-09-07T07:55:16.8382232Z set +e 2025-09-07T07:55:16.8382406Z  2025-09-07T07:55:16.8382552Z login() { 2025-09-07T07:55:16.8382905Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-09-07T07:55:16.8383277Z } 2025-09-07T07:55:16.8383429Z  2025-09-07T07:55:16.8383606Z retry () { 2025-09-07T07:55:16.8383802Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-09-07T07:55:16.8384023Z } 2025-09-07T07:55:16.8384175Z  2025-09-07T07:55:16.8384344Z retry login "${DOCKER_REGISTRY}" 2025-09-07T07:55:16.8384561Z  2025-09-07T07:55:16.8384904Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-09-07T07:55:16.8385553Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-09-07T07:55:16.8385825Z  2025-09-07T07:55:16.8385976Z set -e 2025-09-07T07:55:16.8386217Z # ignore output since only exit code is used for conditional 2025-09-07T07:55:16.8386577Z # only pull docker image if it's not available locally 2025-09-07T07:55:16.8386961Z if ! 
docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-09-07T07:55:16.8387326Z  retry docker pull "${DOCKER_IMAGE}" 2025-09-07T07:55:16.8387553Z fi 2025-09-07T07:55:16.8400618Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T07:55:16.8400905Z env: 2025-09-07T07:55:16.8401067Z GIT_DEFAULT_BRANCH: main 2025-09-07T07:55:16.8401745Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:16.8402499Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:16.8402776Z ##[endgroup] 2025-09-07T07:55:16.8430895Z + set +e 2025-09-07T07:55:16.8431260Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:16.8431919Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:16.8434747Z + aws ecr get-login-password --region us-east-1 2025-09-07T07:55:16.8436181Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-09-07T07:55:17.6346010Z 2025-09-07T07:55:17.6346587Z WARNING! Your credentials are stored unencrypted in '/home/david/.docker/config.json'. 2025-09-07T07:55:17.6347125Z Configure a credential helper to remove this warning. 
See 2025-09-07T07:55:17.6347496Z https://docs.docker.com/go/credential-store/ 2025-09-07T07:55:17.6347702Z 2025-09-07T07:55:17.6347787Z Login Succeeded 2025-09-07T07:55:17.6374096Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:17.6375218Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-09-07T07:55:18.0294735Z + IMAGE_SIZE=36183.606596946716 2025-09-07T07:55:18.0295201Z + echo 'Compressed size of image in MB: 36183.606596946716' 2025-09-07T07:55:18.0295512Z + set -e 2025-09-07T07:55:18.0295738Z Compressed size of image in MB: 36183.606596946716 2025-09-07T07:55:18.0296976Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:18.0423238Z + retry docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:18.0424472Z + docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T07:55:18.4388038Z pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77: Pulling from pytorch/ci-image 2025-09-07T07:55:18.4389170Z e6fdc8487bfe: Pulling fs layer 2025-09-07T07:55:18.4389456Z 171dcef20c49: Pulling fs layer 2025-09-07T07:55:18.4389712Z 4c92b3f72f1d: Pulling fs layer 2025-09-07T07:55:18.4389959Z 744f9ba90a65: Pulling fs layer 2025-09-07T07:55:18.4390207Z d3c08322a332: Pulling fs layer 2025-09-07T07:55:18.4390457Z ffd43b71f3cc: Pulling fs layer 2025-09-07T07:55:18.4390702Z 830692b57f6e: Pulling fs layer 
2025-09-07T07:55:18.4390962Z 5bad36d18468: Pulling fs layer 2025-09-07T07:55:18.4391219Z 0e34fdd9ac5c: Pulling fs layer 2025-09-07T07:55:18.4391517Z 3c868a62868e: Pulling fs layer 2025-09-07T07:55:18.4391802Z 62170a22dd57: Pulling fs layer 2025-09-07T07:55:18.4392100Z 553c1d23b6c4: Pulling fs layer 2025-09-07T07:55:18.4392332Z 9408d557a804: Pulling fs layer 2025-09-07T07:55:18.4392558Z df607cfc7c07: Pulling fs layer 2025-09-07T07:55:18.4392794Z 4f4fb700ef54: Pulling fs layer 2025-09-07T07:55:18.4393028Z 40a8e39faeda: Pulling fs layer 2025-09-07T07:55:18.4393255Z 830692b57f6e: Waiting 2025-09-07T07:55:18.4393479Z d895771c9fac: Pulling fs layer 2025-09-07T07:55:18.4393704Z 3c868a62868e: Waiting 2025-09-07T07:55:18.4393906Z 5bad36d18468: Waiting 2025-09-07T07:55:18.4394096Z 9408d557a804: Waiting 2025-09-07T07:55:18.4394293Z 62170a22dd57: Waiting 2025-09-07T07:55:18.4394505Z 553c1d23b6c4: Waiting 2025-09-07T07:55:18.4394712Z c4ee04f39d49: Pulling fs layer 2025-09-07T07:55:18.4395091Z 3690c9826e48: Pulling fs layer 2025-09-07T07:55:18.4395326Z 57cbc5013733: Pulling fs layer 2025-09-07T07:55:18.4395563Z f5f4b06b58bb: Pulling fs layer 2025-09-07T07:55:18.4395795Z 0e34fdd9ac5c: Waiting 2025-09-07T07:55:18.4395990Z df607cfc7c07: Waiting 2025-09-07T07:55:18.4396199Z f59713ce4bf4: Pulling fs layer 2025-09-07T07:55:18.4396444Z fe0486521517: Pulling fs layer 2025-09-07T07:55:18.4396668Z 4f4fb700ef54: Waiting 2025-09-07T07:55:18.4396871Z 8c21cc3715a2: Pulling fs layer 2025-09-07T07:55:18.4397136Z 3690c9826e48: Waiting 2025-09-07T07:55:18.4397346Z d37c58456a6a: Pulling fs layer 2025-09-07T07:55:18.4397872Z 57cbc5013733: Waiting 2025-09-07T07:55:18.4398084Z f59713ce4bf4: Waiting 2025-09-07T07:55:18.4398289Z d042f63abc13: Pulling fs layer 2025-09-07T07:55:18.4398521Z 744f9ba90a65: Waiting 2025-09-07T07:55:18.4398725Z 621284a9c05a: Pulling fs layer 2025-09-07T07:55:18.4398953Z d895771c9fac: Waiting 2025-09-07T07:55:18.4399151Z c4ee04f39d49: Waiting 2025-09-07T07:55:18.4399355Z 
ffd43b71f3cc: Waiting 2025-09-07T07:55:18.4399555Z fe0486521517: Waiting 2025-09-07T07:55:18.4399760Z 85f605d2dd3a: Pulling fs layer 2025-09-07T07:55:18.4399978Z d3c08322a332: Waiting 2025-09-07T07:55:18.4400177Z 8c21cc3715a2: Waiting 2025-09-07T07:55:18.4400374Z d37c58456a6a: Waiting 2025-09-07T07:55:18.4400581Z 381b5539e598: Pulling fs layer 2025-09-07T07:55:18.4400809Z d042f63abc13: Waiting 2025-09-07T07:55:18.4401012Z 85f605d2dd3a: Waiting 2025-09-07T07:55:18.4401219Z a487c0c80029: Pulling fs layer 2025-09-07T07:55:18.4401433Z 48bcb81e2566: Pulling fs layer 2025-09-07T07:55:18.4401612Z 381b5539e598: Waiting 2025-09-07T07:55:18.4401777Z 40a8e39faeda: Waiting 2025-09-07T07:55:18.4401950Z e261928c0043: Pulling fs layer 2025-09-07T07:55:18.4402132Z 48bcb81e2566: Waiting 2025-09-07T07:55:18.4402284Z a487c0c80029: Waiting 2025-09-07T07:55:18.4402451Z 0fea55428091: Pulling fs layer 2025-09-07T07:55:18.4402635Z e261928c0043: Waiting 2025-09-07T07:55:18.4402974Z 0fea55428091: Waiting 2025-09-07T07:55:18.4403145Z b4291bccbb84: Pulling fs layer 2025-09-07T07:55:18.4403340Z ddc91b09189a: Pulling fs layer 2025-09-07T07:55:18.4403543Z 7540c7428627: Pulling fs layer 2025-09-07T07:55:18.4403729Z f5f4b06b58bb: Waiting 2025-09-07T07:55:18.4403884Z b4291bccbb84: Waiting 2025-09-07T07:55:18.4404054Z 003c4e2598fb: Pulling fs layer 2025-09-07T07:55:18.4404238Z ddc91b09189a: Waiting 2025-09-07T07:55:18.4404400Z 7540c7428627: Waiting 2025-09-07T07:55:18.4404560Z 5687149362ae: Pulling fs layer 2025-09-07T07:55:18.4404753Z cdd2cf54eb2a: Pulling fs layer 2025-09-07T07:55:18.4405076Z 621284a9c05a: Waiting 2025-09-07T07:55:18.4405258Z d3ad4df1ba3a: Pulling fs layer 2025-09-07T07:55:18.4405457Z 3c9055753b4c: Pulling fs layer 2025-09-07T07:55:18.4405644Z 003c4e2598fb: Waiting 2025-09-07T07:55:18.4405814Z 5687149362ae: Waiting 2025-09-07T07:55:18.4405973Z cdd2cf54eb2a: Waiting 2025-09-07T07:55:18.4406148Z 31cf8d0bd21c: Pulling fs layer 2025-09-07T07:55:18.4406343Z 6623ea814971: Pulling fs 
layer 2025-09-07T07:55:18.4406526Z 3c9055753b4c: Waiting 2025-09-07T07:55:18.4406691Z 11696c3aa380: Pulling fs layer 2025-09-07T07:55:18.4406877Z 31cf8d0bd21c: Waiting 2025-09-07T07:55:18.4407051Z ef4d544e35ca: Pulling fs layer 2025-09-07T07:55:18.4407238Z d3ad4df1ba3a: Waiting 2025-09-07T07:55:18.4407395Z 6623ea814971: Waiting 2025-09-07T07:55:18.4407557Z 11696c3aa380: Waiting 2025-09-07T07:55:18.4407723Z 5c5108865e5e: Pulling fs layer 2025-09-07T07:55:18.4407909Z ef4d544e35ca: Waiting 2025-09-07T07:55:18.4408073Z 9e97578e9edf: Pulling fs layer 2025-09-07T07:55:18.4408263Z 5c5108865e5e: Waiting 2025-09-07T07:55:18.4408435Z da5a91b54cb5: Pulling fs layer 2025-09-07T07:55:18.4408642Z 9e97578e9edf: Waiting 2025-09-07T07:55:18.4408802Z da5a91b54cb5: Waiting 2025-09-07T07:55:18.4408976Z 1e93be219e89: Pulling fs layer 2025-09-07T07:55:18.4409171Z 136825afebb5: Pulling fs layer 2025-09-07T07:55:18.4409367Z 22b39805302d: Pulling fs layer 2025-09-07T07:55:18.4409561Z d12add675e35: Pulling fs layer 2025-09-07T07:55:18.4409758Z bc127046d33a: Pulling fs layer 2025-09-07T07:55:18.4409944Z 1e93be219e89: Waiting 2025-09-07T07:55:18.4410110Z d12add675e35: Waiting 2025-09-07T07:55:18.4410266Z bc127046d33a: Waiting 2025-09-07T07:55:18.4410436Z 951e8ce83841: Pulling fs layer 2025-09-07T07:55:18.4410621Z 22b39805302d: Waiting 2025-09-07T07:55:18.4410798Z 32340b97ae50: Pulling fs layer 2025-09-07T07:55:18.4410984Z 5bbb04cd6b57: Pulling fs layer 2025-09-07T07:55:18.4411194Z d8c4b845cfc7: Pulling fs layer 2025-09-07T07:55:18.4411383Z 951e8ce83841: Waiting 2025-09-07T07:55:18.4411555Z b35c180f4d8d: Pulling fs layer 2025-09-07T07:55:18.4411741Z 5f967b3c303a: Pulling fs layer 2025-09-07T07:55:18.4412080Z 32340b97ae50: Waiting 2025-09-07T07:55:18.4412248Z 04770904f012: Pulling fs layer 2025-09-07T07:55:18.4412436Z 73373941fb32: Pulling fs layer 2025-09-07T07:55:18.4412613Z b35c180f4d8d: Waiting 2025-09-07T07:55:18.4412782Z 9572e6cd907b: Pulling fs layer 2025-09-07T07:55:18.4412964Z 
04770904f012: Waiting 2025-09-07T07:55:18.4413139Z 5bbb04cd6b57: Waiting 2025-09-07T07:55:18.4413312Z 64a544aba233: Pulling fs layer 2025-09-07T07:55:18.4413503Z 7e35418a2499: Pulling fs layer 2025-09-07T07:55:18.4413687Z 73373941fb32: Waiting 2025-09-07T07:55:18.4413848Z 9572e6cd907b: Waiting 2025-09-07T07:55:18.4414007Z 5f967b3c303a: Waiting 2025-09-07T07:55:18.4414175Z 2ed8e82748d4: Pulling fs layer 2025-09-07T07:55:18.4414362Z 64a544aba233: Waiting 2025-09-07T07:55:18.4414534Z c988fbcccd70: Pulling fs layer 2025-09-07T07:55:18.4414725Z d8c4b845cfc7: Waiting 2025-09-07T07:55:18.4414886Z 7e35418a2499: Waiting 2025-09-07T07:55:18.4415179Z 2ed8e82748d4: Waiting 2025-09-07T07:55:18.6029331Z 171dcef20c49: Verifying Checksum 2025-09-07T07:55:18.6029857Z 171dcef20c49: Download complete 2025-09-07T07:55:18.7599704Z 744f9ba90a65: Download complete 2025-09-07T07:55:18.8914916Z e6fdc8487bfe: Verifying Checksum 2025-09-07T07:55:18.8915375Z e6fdc8487bfe: Download complete 2025-09-07T07:55:18.9255229Z d3c08322a332: Verifying Checksum 2025-09-07T07:55:18.9255904Z d3c08322a332: Download complete 2025-09-07T07:55:19.0621077Z ffd43b71f3cc: Download complete 2025-09-07T07:55:19.0962874Z 830692b57f6e: Verifying Checksum 2025-09-07T07:55:19.0963185Z 830692b57f6e: Download complete 2025-09-07T07:55:19.2307842Z 5bad36d18468: Download complete 2025-09-07T07:55:19.3961564Z 3c868a62868e: Download complete 2025-09-07T07:55:19.5501069Z 62170a22dd57: Download complete 2025-09-07T07:55:19.6467441Z e6fdc8487bfe: Pull complete 2025-09-07T07:55:19.6886140Z 171dcef20c49: Pull complete 2025-09-07T07:55:19.7157604Z 553c1d23b6c4: Verifying Checksum 2025-09-07T07:55:19.7157907Z 553c1d23b6c4: Download complete 2025-09-07T07:55:19.8868879Z 9408d557a804: Verifying Checksum 2025-09-07T07:55:19.8869225Z 9408d557a804: Download complete 2025-09-07T07:55:20.3364659Z 0e34fdd9ac5c: Verifying Checksum 2025-09-07T07:55:20.3365230Z 0e34fdd9ac5c: Download complete 2025-09-07T07:55:20.3956771Z 4f4fb700ef54: 
Verifying Checksum 2025-09-07T07:55:20.3957109Z 4f4fb700ef54: Download complete 2025-09-07T07:55:20.5382238Z 40a8e39faeda: Verifying Checksum 2025-09-07T07:55:20.5382551Z 40a8e39faeda: Download complete 2025-09-07T07:55:21.2090488Z d895771c9fac: Verifying Checksum 2025-09-07T07:55:21.2090827Z d895771c9fac: Download complete 2025-09-07T07:55:21.3649216Z c4ee04f39d49: Verifying Checksum 2025-09-07T07:55:21.3649536Z c4ee04f39d49: Download complete 2025-09-07T07:55:21.6236132Z 3690c9826e48: Verifying Checksum 2025-09-07T07:55:21.6236409Z 3690c9826e48: Download complete 2025-09-07T07:55:21.7178354Z 4c92b3f72f1d: Verifying Checksum 2025-09-07T07:55:21.7178649Z 4c92b3f72f1d: Download complete 2025-09-07T07:55:21.7912231Z 57cbc5013733: Download complete 2025-09-07T07:55:21.8856222Z f5f4b06b58bb: Verifying Checksum 2025-09-07T07:55:21.8856536Z f5f4b06b58bb: Download complete 2025-09-07T07:55:21.9572629Z f59713ce4bf4: Verifying Checksum 2025-09-07T07:55:21.9572953Z f59713ce4bf4: Download complete 2025-09-07T07:55:22.0473926Z fe0486521517: Verifying Checksum 2025-09-07T07:55:22.0474253Z fe0486521517: Download complete 2025-09-07T07:55:22.2115727Z d37c58456a6a: Verifying Checksum 2025-09-07T07:55:22.2116035Z d37c58456a6a: Download complete 2025-09-07T07:55:22.6990192Z d042f63abc13: Verifying Checksum 2025-09-07T07:55:22.6990501Z d042f63abc13: Download complete 2025-09-07T07:55:22.8417010Z 621284a9c05a: Download complete 2025-09-07T07:55:22.9989961Z 85f605d2dd3a: Download complete 2025-09-07T07:55:27.6902559Z 381b5539e598: Verifying Checksum 2025-09-07T07:55:27.6902908Z 381b5539e598: Download complete 2025-09-07T07:55:27.8440446Z a487c0c80029: Download complete 2025-09-07T07:55:27.9477170Z 4c92b3f72f1d: Pull complete 2025-09-07T07:55:27.9910585Z 744f9ba90a65: Pull complete 2025-09-07T07:55:27.9976263Z 48bcb81e2566: Verifying Checksum 2025-09-07T07:55:27.9976547Z 48bcb81e2566: Download complete 2025-09-07T07:55:28.0293351Z d3c08322a332: Pull complete 2025-09-07T07:55:28.0727770Z 
ffd43b71f3cc: Pull complete 2025-09-07T07:55:28.1161728Z 830692b57f6e: Pull complete 2025-09-07T07:55:28.1507011Z 5bad36d18468: Pull complete 2025-09-07T07:55:28.1574076Z e261928c0043: Download complete 2025-09-07T07:55:28.3843673Z 0fea55428091: Download complete 2025-09-07T07:55:28.7285996Z b4291bccbb84: Verifying Checksum 2025-09-07T07:55:28.7286365Z b4291bccbb84: Download complete 2025-09-07T07:55:28.8932754Z ddc91b09189a: Download complete 2025-09-07T07:55:29.0627712Z 7540c7428627: Verifying Checksum 2025-09-07T07:55:29.0628079Z 7540c7428627: Download complete 2025-09-07T07:55:29.3703099Z 003c4e2598fb: Verifying Checksum 2025-09-07T07:55:29.3703460Z 003c4e2598fb: Download complete 2025-09-07T07:55:29.5356672Z 5687149362ae: Download complete 2025-09-07T07:55:29.7068553Z cdd2cf54eb2a: Verifying Checksum 2025-09-07T07:55:29.7068910Z cdd2cf54eb2a: Download complete 2025-09-07T07:55:31.8592424Z 0e34fdd9ac5c: Pull complete 2025-09-07T07:55:34.6859942Z 3c868a62868e: Pull complete 2025-09-07T07:55:38.7994604Z 62170a22dd57: Pull complete 2025-09-07T07:55:43.3117855Z 553c1d23b6c4: Pull complete 2025-09-07T07:55:48.8629512Z 9408d557a804: Pull complete 2025-09-07T07:55:52.4546305Z df607cfc7c07: Verifying Checksum 2025-09-07T07:55:52.4546673Z df607cfc7c07: Download complete 2025-09-07T07:55:52.6158419Z 3c9055753b4c: Verifying Checksum 2025-09-07T07:55:52.6158741Z 3c9055753b4c: Download complete 2025-09-07T07:55:56.3296566Z 31cf8d0bd21c: Verifying Checksum 2025-09-07T07:55:56.3296955Z 31cf8d0bd21c: Download complete 2025-09-07T07:56:35.7046603Z df607cfc7c07: Pull complete 2025-09-07T07:56:38.8484613Z 4f4fb700ef54: Pull complete 2025-09-07T07:56:40.9751096Z 8c21cc3715a2: Verifying Checksum 2025-09-07T07:56:40.9751465Z 8c21cc3715a2: Download complete 2025-09-07T07:56:41.2253652Z 11696c3aa380: Verifying Checksum 2025-09-07T07:56:41.2253955Z 11696c3aa380: Download complete 2025-09-07T07:56:41.5376277Z ef4d544e35ca: Verifying Checksum 2025-09-07T07:56:41.5376576Z ef4d544e35ca: 
Download complete 2025-09-07T07:56:41.6941431Z 5c5108865e5e: Verifying Checksum 2025-09-07T07:56:41.6941723Z 5c5108865e5e: Download complete 2025-09-07T07:56:41.9599912Z 9e97578e9edf: Download complete 2025-09-07T07:56:42.2156304Z da5a91b54cb5: Download complete 2025-09-07T07:56:42.3710282Z 1e93be219e89: Verifying Checksum 2025-09-07T07:56:42.3710603Z 1e93be219e89: Download complete 2025-09-07T07:56:42.6531463Z 136825afebb5: Verifying Checksum 2025-09-07T07:56:42.6531933Z 136825afebb5: Download complete 2025-09-07T07:56:42.9833144Z 22b39805302d: Verifying Checksum 2025-09-07T07:56:42.9833487Z 22b39805302d: Download complete 2025-09-07T07:56:43.1525451Z d12add675e35: Verifying Checksum 2025-09-07T07:56:43.1525780Z d12add675e35: Download complete 2025-09-07T07:56:43.1968777Z 40a8e39faeda: Pull complete 2025-09-07T07:56:43.3398866Z bc127046d33a: Verifying Checksum 2025-09-07T07:56:43.3399212Z bc127046d33a: Download complete 2025-09-07T07:56:43.5029203Z 951e8ce83841: Verifying Checksum 2025-09-07T07:56:43.5029457Z 951e8ce83841: Download complete 2025-09-07T07:56:43.7405636Z 32340b97ae50: Verifying Checksum 2025-09-07T07:56:43.7405941Z 32340b97ae50: Download complete 2025-09-07T07:56:44.0473428Z 5bbb04cd6b57: Verifying Checksum 2025-09-07T07:56:44.0473724Z 5bbb04cd6b57: Download complete 2025-09-07T07:56:44.3116609Z d8c4b845cfc7: Verifying Checksum 2025-09-07T07:56:44.3116930Z d8c4b845cfc7: Download complete 2025-09-07T07:56:47.1239902Z b35c180f4d8d: Verifying Checksum 2025-09-07T07:56:47.1240280Z b35c180f4d8d: Download complete 2025-09-07T07:56:47.2989726Z 5f967b3c303a: Verifying Checksum 2025-09-07T07:56:47.2990029Z 5f967b3c303a: Download complete 2025-09-07T07:56:47.6768725Z 04770904f012: Verifying Checksum 2025-09-07T07:56:47.6769063Z 04770904f012: Download complete 2025-09-07T07:56:47.9059761Z 73373941fb32: Verifying Checksum 2025-09-07T07:56:47.9060032Z 73373941fb32: Download complete 2025-09-07T07:56:48.0882437Z 9572e6cd907b: Verifying Checksum 
2025-09-07T07:56:48.0882880Z 9572e6cd907b: Download complete 2025-09-07T07:56:48.2252979Z 64a544aba233: Verifying Checksum 2025-09-07T07:56:48.2253242Z 64a544aba233: Download complete 2025-09-07T07:56:48.2790691Z d895771c9fac: Pull complete 2025-09-07T07:56:48.5273409Z 7e35418a2499: Verifying Checksum 2025-09-07T07:56:48.5273690Z 7e35418a2499: Download complete 2025-09-07T07:56:48.6671493Z 2ed8e82748d4: Verifying Checksum 2025-09-07T07:56:48.6671760Z 2ed8e82748d4: Download complete 2025-09-07T07:56:49.5425236Z c988fbcccd70: Verifying Checksum 2025-09-07T07:56:49.5425591Z c988fbcccd70: Download complete 2025-09-07T07:56:52.2068908Z c4ee04f39d49: Pull complete 2025-09-07T07:56:55.7051279Z 3690c9826e48: Pull complete 2025-09-07T07:57:00.1220876Z 57cbc5013733: Pull complete 2025-09-07T07:57:01.6843116Z 6623ea814971: Verifying Checksum 2025-09-07T07:57:01.6843425Z 6623ea814971: Download complete 2025-09-07T07:57:03.5775508Z f5f4b06b58bb: Pull complete 2025-09-07T07:57:07.4577251Z f59713ce4bf4: Pull complete 2025-09-07T07:57:10.9994355Z fe0486521517: Pull complete 2025-09-07T07:59:11.0643942Z d3ad4df1ba3a: Verifying Checksum 2025-09-07T07:59:11.0644320Z d3ad4df1ba3a: Download complete 2025-09-07T08:00:01.9832044Z 8c21cc3715a2: Pull complete 2025-09-07T08:00:05.5368336Z d37c58456a6a: Pull complete 2025-09-07T08:00:09.8059250Z d042f63abc13: Pull complete 2025-09-07T08:00:13.6105331Z 621284a9c05a: Pull complete 2025-09-07T08:00:16.6910166Z 85f605d2dd3a: Pull complete 2025-09-07T08:00:26.9405841Z 381b5539e598: Pull complete 2025-09-07T08:00:29.8458327Z a487c0c80029: Pull complete 2025-09-07T08:00:33.0423112Z 48bcb81e2566: Pull complete 2025-09-07T08:00:39.6973553Z e261928c0043: Pull complete 2025-09-07T08:00:43.4316922Z 0fea55428091: Pull complete 2025-09-07T08:00:47.2352808Z b4291bccbb84: Pull complete 2025-09-07T08:00:50.5306408Z ddc91b09189a: Pull complete 2025-09-07T08:00:55.0054002Z 7540c7428627: Pull complete 2025-09-07T08:01:04.8220666Z 003c4e2598fb: Pull complete 
2025-09-07T08:01:09.6458346Z 5687149362ae: Pull complete 2025-09-07T08:01:12.4426223Z cdd2cf54eb2a: Pull complete 2025-09-07T08:05:54.4484099Z d3ad4df1ba3a: Pull complete 2025-09-07T08:05:58.1829790Z 3c9055753b4c: Pull complete 2025-09-07T08:06:02.9683336Z 31cf8d0bd21c: Pull complete 2025-09-07T08:07:44.2857178Z 6623ea814971: Pull complete 2025-09-07T08:07:46.7853151Z 11696c3aa380: Pull complete 2025-09-07T08:07:50.2900636Z ef4d544e35ca: Pull complete 2025-09-07T08:07:57.0419001Z 5c5108865e5e: Pull complete 2025-09-07T08:08:02.1528523Z 9e97578e9edf: Pull complete 2025-09-07T08:08:03.0644292Z da5a91b54cb5: Pull complete 2025-09-07T08:08:04.1090830Z 1e93be219e89: Pull complete 2025-09-07T08:08:08.5823798Z 136825afebb5: Pull complete 2025-09-07T08:08:09.1294725Z 22b39805302d: Pull complete 2025-09-07T08:08:09.5926203Z d12add675e35: Pull complete 2025-09-07T08:08:10.7326200Z bc127046d33a: Pull complete 2025-09-07T08:08:13.7401172Z 951e8ce83841: Pull complete 2025-09-07T08:08:14.9483909Z 32340b97ae50: Pull complete 2025-09-07T08:08:15.7419007Z 5bbb04cd6b57: Pull complete 2025-09-07T08:08:16.5349678Z d8c4b845cfc7: Pull complete 2025-09-07T08:08:25.1054883Z b35c180f4d8d: Pull complete 2025-09-07T08:08:27.9093237Z 5f967b3c303a: Pull complete 2025-09-07T08:08:30.3059030Z 04770904f012: Pull complete 2025-09-07T08:08:33.4722246Z 73373941fb32: Pull complete 2025-09-07T08:08:36.5631103Z 9572e6cd907b: Pull complete 2025-09-07T08:08:39.1572069Z 64a544aba233: Pull complete 2025-09-07T08:08:43.4218769Z 7e35418a2499: Pull complete 2025-09-07T08:08:47.0993604Z 2ed8e82748d4: Pull complete 2025-09-07T08:08:51.3776227Z c988fbcccd70: Pull complete 2025-09-07T08:08:54.8678609Z Digest: sha256:f30843ff9ea9e117a2c8e6d207e85c9e77dfe682f1dfcdfea5b94178d1bf00b3 2025-09-07T08:08:55.1450295Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 
2025-09-07T08:08:55.1616042Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T08:08:55.1681501Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T08:08:55.1682272Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-09-07T08:08:55.1697224Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:08:55.1697508Z env: 2025-09-07T08:08:55.1697666Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:08:55.1697864Z ##[endgroup] 2025-09-07T08:08:55.2132037Z ##[group]Run echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" 2025-09-07T08:08:55.2132609Z echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" 2025-09-07T08:08:55.2147700Z shell: /usr/bin/bash -e {0} 2025-09-07T08:08:55.2147926Z env: 2025-09-07T08:08:55.2148086Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:08:55.2148280Z ##[endgroup] 2025-09-07T08:08:55.2456543Z ##[group]Run echo "SCCACHE_SERVER_PORT_DOCKER_FLAG=-e SCCACHE_SERVER_PORT=$((RUNNER_UID + 4226))" >> "${GITHUB_ENV}" 2025-09-07T08:08:55.2457272Z echo "SCCACHE_SERVER_PORT_DOCKER_FLAG=-e SCCACHE_SERVER_PORT=$((RUNNER_UID + 4226))" >> "${GITHUB_ENV}" 2025-09-07T08:08:55.2471032Z shell: /usr/bin/bash -e {0} 2025-09-07T08:08:55.2471254Z env: 2025-09-07T08:08:55.2471415Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:08:55.2471662Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:08:55.2471940Z ##[endgroup] 2025-09-07T08:08:55.2585719Z Prepare all required actions 2025-09-07T08:08:55.2749090Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-09-07T08:08:55.2749338Z with: 2025-09-07T08:08:55.2749836Z github-token: *** 2025-09-07T08:08:55.2750034Z env: 2025-09-07T08:08:55.2750196Z 
GIT_DEFAULT_BRANCH: main 2025-09-07T08:08:55.2750450Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:08:55.2750784Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:08:55.2751064Z ##[endgroup] 2025-09-07T08:08:55.2951860Z ##[group]Run set -eux 2025-09-07T08:08:55.2952076Z set -eux 2025-09-07T08:08:55.2952410Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-09-07T08:08:55.2965956Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:08:55.2966248Z env: 2025-09-07T08:08:55.2966416Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:08:55.2966665Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:08:55.2967022Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:08:55.2967507Z GITHUB_TOKEN: *** 2025-09-07T08:08:55.2967690Z ##[endgroup] 2025-09-07T08:08:55.3004676Z + python3 .github/scripts/get_workflow_job_id.py 17525296438 i-0d73070610f53945f-1004 2025-09-07T08:08:56.0139639Z Setting output job-id=49775781836 2025-09-07T08:08:56.0140177Z Setting output job-name=test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T08:08:56.0307931Z ##[group]Run python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-09-07T08:08:56.0308523Z python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-09-07T08:08:56.0309218Z python3 -m tools.stats.monitor --log-interval "$MONITOR_LOG_INTERVAL" --data-collect-interval "$MONITOR_DATA_COLLECT_INTERVAL" > usage_log.txt 2>&1 & 2025-09-07T08:08:56.0309838Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2025-09-07T08:08:56.0323960Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:08:56.0324251Z env: 2025-09-07T08:08:56.0324424Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:08:56.0324678Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:08:56.0325165Z 
SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:08:56.0325681Z JOB_ID: 49775781836 2025-09-07T08:08:56.0326015Z JOB_NAME: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T08:08:56.0326382Z WORKFLOW_NAME: inductor-perf-nightly-h100 2025-09-07T08:08:56.0326633Z WORKFLOW_RUN_ID: 17525296438 2025-09-07T08:08:56.0326844Z MONITOR_LOG_INTERVAL: 15 2025-09-07T08:08:56.0327058Z MONITOR_DATA_COLLECT_INTERVAL: 4 2025-09-07T08:08:56.0327263Z ##[endgroup] 2025-09-07T08:08:56.3099593Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T08:08:56.6233479Z Collecting psutil==5.9.8 2025-09-07T08:08:56.6812112Z Downloading psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB) 2025-09-07T08:08:56.8187370Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 288.2/288.2 KB 2.1 MB/s eta 0:00:00 2025-09-07T08:08:56.9872692Z Collecting dataclasses_json==0.6.7 2025-09-07T08:08:56.9975285Z Downloading dataclasses_json-0.6.7-py3-none-any.whl (28 kB) 2025-09-07T08:08:57.8022910Z Collecting nvidia-ml-py==11.525.84 2025-09-07T08:08:57.8135621Z Downloading nvidia_ml_py-11.525.84-py3-none-any.whl (34 kB) 2025-09-07T08:08:58.0347278Z Collecting marshmallow<4.0.0,>=3.18.0 2025-09-07T08:08:58.0447368Z Downloading marshmallow-3.26.1-py3-none-any.whl (50 kB) 2025-09-07T08:08:58.1489612Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.9/50.9 KB 397.4 kB/s eta 0:00:00 2025-09-07T08:08:58.2932907Z Collecting typing-inspect<1,>=0.4.0 2025-09-07T08:08:58.3033198Z Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB) 2025-09-07T08:08:58.6250072Z Collecting packaging>=17.0 2025-09-07T08:08:58.6660376Z Downloading packaging-25.0-py3-none-any.whl (66 kB) 2025-09-07T08:08:58.9565340Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.5/66.5 KB 194.9 kB/s eta 0:00:00 2025-09-07T08:08:59.1962835Z Collecting mypy-extensions>=0.3.0 2025-09-07T08:08:59.2064777Z Downloading 
mypy_extensions-1.1.0-py3-none-any.whl (5.0 kB) 2025-09-07T08:08:59.7557295Z Collecting typing-extensions>=3.7.4 2025-09-07T08:08:59.7954403Z Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB) 2025-09-07T08:08:59.8689709Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.6/44.6 KB 480.1 kB/s eta 0:00:00 2025-09-07T08:08:59.9313954Z Installing collected packages: nvidia-ml-py, typing-extensions, psutil, packaging, mypy-extensions, typing-inspect, marshmallow, dataclasses_json 2025-09-07T08:09:04.2176882Z Successfully installed dataclasses_json-0.6.7 marshmallow-3.26.1 mypy-extensions-1.1.0 nvidia-ml-py-11.525.84 packaging-25.0 psutil-5.9.8 typing-extensions-4.15.0 typing-inspect-0.9.0 2025-09-07T08:09:04.2833629Z Prepare all required actions 2025-09-07T08:09:04.2833961Z Getting action download info 2025-09-07T08:09:04.4495618Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-09-07T08:09:05.6389447Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-09-07T08:09:08.0207415Z ##[group]Run ./.github/actions/download-build-artifacts 2025-09-07T08:09:08.0207699Z with: 2025-09-07T08:09:08.0207893Z name: linux-jammy-cuda12.8-py3.10-gcc9-sm90 2025-09-07T08:09:08.0208146Z s3-bucket: gha-artifacts 2025-09-07T08:09:08.0208345Z env: 2025-09-07T08:09:08.0208506Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:08.0208760Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:08.0209109Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:08.0209402Z ##[endgroup] 2025-09-07T08:09:08.1142890Z ##[group]Run seemethere/download-artifact-s3@v4 2025-09-07T08:09:08.1143153Z with: 2025-09-07T08:09:08.1143341Z name: linux-jammy-cuda12.8-py3.10-gcc9-sm90 2025-09-07T08:09:08.1143587Z s3-bucket: gha-artifacts 2025-09-07T08:09:08.1143781Z region: us-east-1 2025-09-07T08:09:08.1143937Z env: 2025-09-07T08:09:08.1144104Z 
GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:08.1144348Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:08.1144908Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:08.1145351Z ##[endgroup] 2025-09-07T08:09:08.5255119Z (node:9108) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-09-07T08:09:08.5255579Z 2025-09-07T08:09:08.5255772Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-09-07T08:09:08.5256268Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-09-07T08:09:08.5256804Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-09-07T08:09:08.6457852Z Found 1 objects with prefix pytorch/pytorch/17525296438/linux-jammy-cuda12.8-py3.10-gcc9-sm90/ 2025-09-07T08:09:08.6458470Z Starting download (1/1): /home/david/_work/pytorch/pytorch/artifacts.zip 2025-09-07T08:09:17.4034772Z Finished download (1/1): /home/david/_work/pytorch/pytorch/artifacts.zip 2025-09-07T08:09:17.4043181Z Artifact download has finished successfully 2025-09-07T08:09:17.4599348Z ##[group]Run unzip -o artifacts.zip 2025-09-07T08:09:17.4599635Z unzip -o artifacts.zip 2025-09-07T08:09:17.4614254Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:17.4614548Z env: 2025-09-07T08:09:17.4614712Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:17.4615126Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:17.4615465Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:17.4615741Z ##[endgroup] 2025-09-07T08:09:17.6884476Z Archive: artifacts.zip 2025-09-07T08:09:17.6885776Z creating: dist/ 2025-09-07T08:09:19.6218733Z inflating: dist/torch-2.9.0a0+git93fb23d-cp310-cp310-linux_x86_64.whl 2025-09-07T08:09:19.6219196Z creating: dist/vision/ 2025-09-07T08:09:19.6326411Z inflating: dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 
2025-09-07T08:09:19.6326884Z creating: dist/audio/ 2025-09-07T08:09:19.6381662Z inflating: dist/audio/torchaudio-2.8.0a0+2e30055-cp310-cp310-linux_x86_64.whl 2025-09-07T08:09:19.6382146Z creating: dist/torchrec/ 2025-09-07T08:09:19.6405643Z inflating: dist/torchrec/torchrec-0.3.2-py3-none-any.whl 2025-09-07T08:09:19.6406022Z creating: dist/fbgemm_gpu/ 2025-09-07T08:09:20.4726825Z inflating: dist/fbgemm_gpu/fbgemm_gpu-0.4.1.post421-cp310-cp310-linux_x86_64.whl 2025-09-07T08:09:20.4727315Z creating: dist/ao/ 2025-09-07T08:09:20.4764550Z inflating: dist/ao/torchao-0.7.0+git51c87b6e-py3-none-any.whl 2025-09-07T08:09:20.4884285Z inflating: dist/.ninja_log 2025-09-07T08:09:20.4884799Z creating: build/custom_test_artifacts/ 2025-09-07T08:09:20.4886165Z creating: build/custom_test_artifacts/custom-op-build/ 2025-09-07T08:09:20.4886672Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-09-07T08:09:20.4887224Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-09-07T08:09:20.4893161Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-09-07T08:09:20.4893688Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/ 2025-09-07T08:09:20.4894178Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-09-07T08:09:20.4894706Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-09-07T08:09:20.4895368Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-09-07T08:09:20.4897325Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-09-07T08:09:20.4898548Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-09-07T08:09:20.4899273Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-09-07T08:09:20.4899932Z creating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-09-07T08:09:20.4900839Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-09-07T08:09:20.4902650Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-09-07T08:09:20.4903762Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-09-07T08:09:20.4904621Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-09-07T08:09:20.4906247Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-09-07T08:09:20.4907405Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-09-07T08:09:20.4907988Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-09-07T08:09:20.4908502Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-09-07T08:09:20.4949325Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-09-07T08:09:20.4988722Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-09-07T08:09:20.4989489Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-09-07T08:09:20.5034450Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-09-07T08:09:20.5035475Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-09-07T08:09:20.5036381Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 
2025-09-07T08:09:20.5037303Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-09-07T08:09:20.5038225Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-09-07T08:09:20.5039108Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-09-07T08:09:20.5039985Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-09-07T08:09:20.5040980Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-09-07T08:09:20.5041722Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-09-07T08:09:20.5042414Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-09-07T08:09:20.5043090Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-09-07T08:09:20.5043760Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-09-07T08:09:20.5044417Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-09-07T08:09:20.5045196Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-09-07T08:09:20.5045860Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-09-07T08:09:20.5111166Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-09-07T08:09:20.5111891Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-09-07T08:09:20.5178133Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-09-07T08:09:20.5179207Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-09-07T08:09:20.5179761Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-09-07T08:09:20.5180312Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-09-07T08:09:20.5180896Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-09-07T08:09:20.5181617Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-09-07T08:09:20.5182356Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-09-07T08:09:20.5183056Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-09-07T08:09:20.5183709Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-09-07T08:09:20.5184380Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-09-07T08:09:20.5185243Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-09-07T08:09:20.5185922Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-09-07T08:09:20.5186590Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-09-07T08:09:20.5187244Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-09-07T08:09:20.5203374Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-09-07T08:09:20.5389712Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-09-07T08:09:20.5390485Z creating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-09-07T08:09:20.5391163Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-09-07T08:09:20.5391946Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-09-07T08:09:20.5392673Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-09-07T08:09:20.5393353Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-09-07T08:09:20.5394065Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-09-07T08:09:20.5395148Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-09-07T08:09:20.5395893Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-09-07T08:09:20.5396602Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-09-07T08:09:20.5397296Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-09-07T08:09:20.5413783Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-09-07T08:09:20.5488102Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-09-07T08:09:20.5488860Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-09-07T08:09:20.5489554Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-09-07T08:09:20.5490179Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-09-07T08:09:20.5490751Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 
2025-09-07T08:09:20.5491334Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-09-07T08:09:20.5491951Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/InstallScripts.json 2025-09-07T08:09:20.5492752Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2025-09-07T08:09:20.5494538Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-09-07T08:09:20.5495325Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-09-07T08:09:20.5496008Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-09-07T08:09:20.5652678Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-09-07T08:09:20.5703384Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-09-07T08:09:20.5703840Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-09-07T08:09:20.5704248Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-09-07T08:09:20.5704751Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-09-07T08:09:20.5711550Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-09-07T08:09:20.5712143Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/ 2025-09-07T08:09:20.5712693Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-09-07T08:09:20.5713284Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-09-07T08:09:20.5713857Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-09-07T08:09:20.5715553Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-09-07T08:09:20.5716823Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-09-07T08:09:20.5717730Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-09-07T08:09:20.5718496Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-09-07T08:09:20.5719204Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-09-07T08:09:20.5720731Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-09-07T08:09:20.5722045Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-09-07T08:09:20.5722860Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-09-07T08:09:20.5724454Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-09-07T08:09:20.5725634Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-09-07T08:09:20.5726242Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-09-07T08:09:20.5726762Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-09-07T08:09:20.5767277Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-09-07T08:09:20.5807044Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-09-07T08:09:20.5807969Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-09-07T08:09:20.5852995Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-09-07T08:09:20.5853881Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-09-07T08:09:20.5854771Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-09-07T08:09:20.5855872Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-09-07T08:09:20.5857047Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-09-07T08:09:20.5857937Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-09-07T08:09:20.5858823Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-09-07T08:09:20.5859769Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-09-07T08:09:20.5860567Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-09-07T08:09:20.5861223Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-09-07T08:09:20.5861955Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-09-07T08:09:20.5862587Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-09-07T08:09:20.5863208Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-09-07T08:09:20.5863836Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-09-07T08:09:20.5864453Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-09-07T08:09:20.5930902Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-09-07T08:09:20.5931579Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-09-07T08:09:20.5998098Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-09-07T08:09:20.5998691Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-09-07T08:09:20.5999221Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-09-07T08:09:20.5999918Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-09-07T08:09:20.6000637Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-09-07T08:09:20.6001326Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-09-07T08:09:20.6002278Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-09-07T08:09:20.6003006Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-09-07T08:09:20.6003672Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-09-07T08:09:20.6004355Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-09-07T08:09:20.6005222Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-09-07T08:09:20.6005920Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-09-07T08:09:20.6006610Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-09-07T08:09:20.6027362Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-09-07T08:09:20.6028312Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-09-07T08:09:20.6080908Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-09-07T08:09:20.6081578Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-09-07T08:09:20.6082448Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-09-07T08:09:20.6082995Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-09-07T08:09:20.6083488Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-09-07T08:09:20.6084054Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-09-07T08:09:20.6084563Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/InstallScripts.json 2025-09-07T08:09:20.6085194Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2025-09-07T08:09:20.6087439Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-09-07T08:09:20.6088081Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-09-07T08:09:20.6089204Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-09-07T08:09:20.6123282Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-09-07T08:09:20.6123853Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-09-07T08:09:20.6124339Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-09-07T08:09:20.6125096Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-09-07T08:09:20.6131231Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-09-07T08:09:20.6131793Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/ 2025-09-07T08:09:20.6132357Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-09-07T08:09:20.6132931Z creating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-09-07T08:09:20.6133499Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-09-07T08:09:20.6134895Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-09-07T08:09:20.6136601Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-09-07T08:09:20.6137238Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-09-07T08:09:20.6137833Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-09-07T08:09:20.6138400Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-09-07T08:09:20.6140400Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-09-07T08:09:20.6141233Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-09-07T08:09:20.6142001Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-09-07T08:09:20.6143430Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-09-07T08:09:20.6144588Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-09-07T08:09:20.6145404Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-09-07T08:09:20.6145983Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-09-07T08:09:20.6185587Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-09-07T08:09:20.6224854Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-09-07T08:09:20.6225995Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-09-07T08:09:20.6271012Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-09-07T08:09:20.6271975Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-09-07T08:09:20.6272939Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-09-07T08:09:20.6273933Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-09-07T08:09:20.6274896Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-09-07T08:09:20.6275978Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-09-07T08:09:20.6276936Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-09-07T08:09:20.6277865Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-09-07T08:09:20.6278773Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-09-07T08:09:20.6279642Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-09-07T08:09:20.6280459Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-09-07T08:09:20.6281166Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-09-07T08:09:20.6281871Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-09-07T08:09:20.6282574Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-09-07T08:09:20.6283304Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-09-07T08:09:20.6347984Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-09-07T08:09:20.6348728Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-09-07T08:09:20.6415357Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-09-07T08:09:20.6416331Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-09-07T08:09:20.6416927Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-09-07T08:09:20.6417536Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-09-07T08:09:20.6418177Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-09-07T08:09:20.6418897Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-09-07T08:09:20.6419688Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-09-07T08:09:20.6420356Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-09-07T08:09:20.6420989Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-09-07T08:09:20.6421720Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-09-07T08:09:20.6422383Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-09-07T08:09:20.6423030Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-09-07T08:09:20.6423832Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-09-07T08:09:20.6424474Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-09-07T08:09:20.6425489Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-09-07T08:09:20.6533569Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-09-07T08:09:20.6534335Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-09-07T08:09:20.6535254Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-09-07T08:09:20.6536128Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-09-07T08:09:20.6536967Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-09-07T08:09:20.6537742Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-09-07T08:09:20.6538542Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-09-07T08:09:20.6539327Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-09-07T08:09:20.6539965Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-09-07T08:09:20.6540602Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-09-07T08:09:20.6541232Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-09-07T08:09:20.6558513Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-09-07T08:09:20.6607348Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-09-07T08:09:20.6608246Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-09-07T08:09:20.6608994Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-09-07T08:09:20.6609681Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-09-07T08:09:20.6610680Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-09-07T08:09:20.6611296Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-09-07T08:09:20.6611934Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/InstallScripts.json 2025-09-07T08:09:20.6612554Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2025-09-07T08:09:20.6613646Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-09-07T08:09:20.6614489Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-09-07T08:09:20.6615256Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-09-07T08:09:20.6704730Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-09-07T08:09:20.6740046Z inflating: 
build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-09-07T08:09:20.6740493Z creating: build/lib/ 2025-09-07T08:09:20.6819832Z inflating: build/lib/libprotobuf-lite.a 2025-09-07T08:09:20.7223784Z inflating: build/lib/libprotobuf.a 2025-09-07T08:09:20.7232691Z inflating: build/lib/libpthreadpool.a 2025-09-07T08:09:20.7239939Z inflating: build/lib/libcpuinfo.a 2025-09-07T08:09:20.7689767Z inflating: build/lib/libprotoc.a 2025-09-07T08:09:20.7696638Z inflating: build/lib/libcpuinfo_internals.a 2025-09-07T08:09:20.7697341Z inflating: build/lib/libclog.a 2025-09-07T08:09:20.7699269Z inflating: build/lib/libnnpack_reference_layers.a 2025-09-07T08:09:20.7716084Z inflating: build/lib/libpytorch_qnnpack.a 2025-09-07T08:09:20.7877382Z inflating: build/lib/libmicrokernels-prod.a 2025-09-07T08:09:20.7893655Z inflating: build/lib/libnnpack.a 2025-09-07T08:09:20.8596198Z inflating: build/lib/libmicrokernels-all.a 2025-09-07T08:09:20.8657074Z inflating: build/lib/libgtest.a 2025-09-07T08:09:20.8672732Z inflating: build/lib/libgmock.a 2025-09-07T08:09:20.8673478Z inflating: build/lib/libgmock_main.a 2025-09-07T08:09:20.8673961Z inflating: build/lib/libgtest_main.a 2025-09-07T08:09:20.8742910Z inflating: build/lib/libbenchmark.a 2025-09-07T08:09:20.8743516Z inflating: build/lib/libbenchmark_main.a 2025-09-07T08:09:20.8825407Z inflating: build/lib/libXNNPACK.a 2025-09-07T08:09:20.8825905Z inflating: build/lib/libjitprofiling.a 2025-09-07T08:09:20.8832872Z inflating: build/lib/libittnotify.a 2025-09-07T08:09:20.8892047Z inflating: build/lib/libasmjit.a 2025-09-07T08:09:21.0155520Z inflating: build/lib/libfbgemm.a 2025-09-07T08:09:21.0183085Z inflating: build/lib/libtensorpipe_uv.a 2025-09-07T08:09:21.0690319Z inflating: build/lib/libtensorpipe.a 2025-09-07T08:09:21.0919579Z inflating: build/lib/libtensorpipe_cuda.a 2025-09-07T08:09:21.1038643Z inflating: build/lib/libgloo.a 2025-09-07T08:09:21.1082928Z inflating: build/lib/libonnx_proto.a 
2025-09-07T08:09:21.1739402Z inflating: build/lib/libonnx.a 2025-09-07T08:09:21.2143044Z inflating: build/lib/libgloo_cuda.a 2025-09-07T08:09:21.2160505Z inflating: build/lib/libfmt.a 2025-09-07T08:09:22.1601189Z inflating: build/lib/libdnnl.a 2025-09-07T08:09:22.2022408Z inflating: build/lib/libkineto.a 2025-09-07T08:09:22.2125432Z inflating: build/lib/libc10.so 2025-09-07T08:09:22.2126783Z inflating: build/lib/libtorch_global_deps.so 2025-09-07T08:09:22.2128301Z inflating: build/lib/libcaffe2_nvrtc.so 2025-09-07T08:09:22.2183563Z inflating: build/lib/libc10_cuda.so 2025-09-07T08:09:25.1128309Z inflating: build/lib/libtorch_cpu.so 2025-09-07T08:09:25.1802435Z inflating: build/lib/libtorch_nvshmem.so 2025-09-07T08:09:27.0276881Z inflating: build/lib/libtorch_cuda.so 2025-09-07T08:09:27.0278036Z inflating: build/lib/libtorch.so 2025-09-07T08:09:27.0323310Z inflating: build/lib/libtorch_cuda_linalg.so 2025-09-07T08:09:27.0386455Z inflating: build/lib/libtorchbind_test.so 2025-09-07T08:09:27.0403672Z inflating: build/lib/libjitbackend_test.so 2025-09-07T08:09:27.0425589Z inflating: build/lib/libbackend_with_compiler.so 2025-09-07T08:09:27.0449772Z inflating: build/lib/libaoti_custom_ops.so 2025-09-07T08:09:27.0452059Z inflating: build/lib/libc10d_cuda_test.so 2025-09-07T08:09:27.0455773Z inflating: build/lib/libshm.so 2025-09-07T08:09:27.2461878Z inflating: build/lib/libtorch_python.so 2025-09-07T08:09:27.2493602Z inflating: build/lib/libnnapi_backend.so 2025-09-07T08:09:27.2494124Z creating: build/bin/ 2025-09-07T08:09:27.2894530Z inflating: build/bin/protoc-3.13.0.0 2025-09-07T08:09:27.3293599Z inflating: build/bin/protoc 2025-09-07T08:09:27.3343847Z inflating: build/bin/c10_AllocatorConfig_test 2025-09-07T08:09:27.3392551Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-09-07T08:09:27.3442283Z inflating: build/bin/c10_Device_test 2025-09-07T08:09:27.3488713Z inflating: build/bin/c10_StreamGuard_test 2025-09-07T08:09:27.3543550Z inflating: 
build/bin/c10_SymInt_test 2025-09-07T08:09:27.3592783Z inflating: build/bin/c10_DeviceGuard_test 2025-09-07T08:09:27.3649440Z inflating: build/bin/c10_DispatchKeySet_test 2025-09-07T08:09:27.3702582Z inflating: build/bin/c10_SizesAndStrides_test 2025-09-07T08:09:27.3753652Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-09-07T08:09:27.3820036Z inflating: build/bin/c10_cow_test 2025-09-07T08:09:27.3871150Z inflating: build/bin/c10_Scalar_test 2025-09-07T08:09:27.3923893Z inflating: build/bin/c10_InlineStreamGuard_test 2025-09-07T08:09:27.3974844Z inflating: build/bin/c10_Bitset_test 2025-09-07T08:09:27.4022797Z inflating: build/bin/c10_ArrayRef_test 2025-09-07T08:09:27.4077490Z inflating: build/bin/c10_Enumerate_test 2025-09-07T08:09:27.4124174Z inflating: build/bin/c10_ConstexprCrc_test 2025-09-07T08:09:27.4172673Z inflating: build/bin/c10_DeadlockDetection_test 2025-09-07T08:09:27.4221053Z inflating: build/bin/c10_Half_test 2025-09-07T08:09:27.4274827Z inflating: build/bin/c10_LeftRight_test 2025-09-07T08:09:27.4325459Z inflating: build/bin/c10_IntrusiveList_test 2025-09-07T08:09:27.4377415Z inflating: build/bin/c10_Metaprogramming_test 2025-09-07T08:09:27.4428416Z inflating: build/bin/c10_NetworkFlow_test 2025-09-07T08:09:27.4475592Z inflating: build/bin/c10_Semaphore_test 2025-09-07T08:09:27.4523889Z inflating: build/bin/c10_Synchronized_test 2025-09-07T08:09:27.4577100Z inflating: build/bin/c10_ThreadLocal_test 2025-09-07T08:09:27.4626847Z inflating: build/bin/c10_TypeIndex_test 2025-09-07T08:09:27.4675550Z inflating: build/bin/c10_TypeList_test 2025-09-07T08:09:27.4725309Z inflating: build/bin/c10_accumulate_test 2025-09-07T08:09:27.4778907Z inflating: build/bin/c10_bfloat16_test 2025-09-07T08:09:27.4826108Z inflating: build/bin/c10_TypeTraits_test 2025-09-07T08:09:27.4874277Z inflating: build/bin/c10_bit_cast_test 2025-09-07T08:09:27.4928495Z inflating: build/bin/c10_complex_math_test 2025-09-07T08:09:27.4978544Z inflating: build/bin/c10_exception_test 
2025-09-07T08:09:27.5026677Z inflating: build/bin/c10_generic_math_test 2025-09-07T08:09:27.5078680Z inflating: build/bin/c10_complex_test 2025-09-07T08:09:27.5127488Z inflating: build/bin/c10_irange_test 2025-09-07T08:09:27.5181984Z inflating: build/bin/c10_logging_test 2025-09-07T08:09:27.5232540Z inflating: build/bin/c10_lazy_test 2025-09-07T08:09:27.5280936Z inflating: build/bin/c10_flags_test 2025-09-07T08:09:27.5434659Z inflating: build/bin/c10_intrusive_ptr_test 2025-09-07T08:09:27.5482638Z inflating: build/bin/c10_error_test 2025-09-07T08:09:27.5526443Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-09-07T08:09:27.5576922Z inflating: build/bin/c10_registry_test 2025-09-07T08:09:27.5625688Z inflating: build/bin/c10_tempfile_test 2025-09-07T08:09:27.5678787Z inflating: build/bin/c10_string_util_test 2025-09-07T08:09:27.5822754Z inflating: build/bin/c10_small_vector_test 2025-09-07T08:09:27.5869305Z inflating: build/bin/c10_string_view_test 2025-09-07T08:09:27.5919913Z inflating: build/bin/c10_ssize_test 2025-09-07T08:09:27.5979383Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-09-07T08:09:27.6051211Z inflating: build/bin/c10_optional_test 2025-09-07T08:09:27.6103817Z inflating: build/bin/c10_typeid_test 2025-09-07T08:09:27.6151443Z inflating: build/bin/c10_cuda_CUDATest 2025-09-07T08:09:27.6692765Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-09-07T08:09:27.7252727Z inflating: build/bin/vec_test_all_types_AVX2 2025-09-07T08:09:27.7805510Z inflating: build/bin/vec_test_all_types_AVX512 2025-09-07T08:09:27.7855641Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_stream 2025-09-07T08:09:27.7905788Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_1_var_test 2025-09-07T08:09:27.7956217Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_blocks_and_threads 2025-09-07T08:09:27.8005698Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_same_block 2025-09-07T08:09:27.8055423Z inflating: 
build/bin/c10_cuda_CUDAAssertionsTest_from_2_processes 2025-09-07T08:09:27.8106265Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_thread_and_block_and_device 2025-09-07T08:09:27.8156653Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_multiple_blocks 2025-09-07T08:09:27.8207033Z inflating: build/bin/BackoffTest 2025-09-07T08:09:27.8260067Z inflating: build/bin/TCPStoreTest 2025-09-07T08:09:27.8310682Z inflating: build/bin/HashStoreTest 2025-09-07T08:09:27.8361802Z inflating: build/bin/FileStoreTest 2025-09-07T08:09:27.8374567Z inflating: build/bin/ProcessGroupMPITest 2025-09-07T08:09:27.8377325Z inflating: build/bin/example_allreduce 2025-09-07T08:09:27.8447138Z inflating: build/bin/Dict_test 2025-09-07T08:09:27.8497084Z inflating: build/bin/Dimname_test 2025-09-07T08:09:27.8551961Z inflating: build/bin/NamedTensor_test 2025-09-07T08:09:27.8614135Z inflating: build/bin/MaybeOwned_test 2025-09-07T08:09:27.8670237Z inflating: build/bin/atest 2025-09-07T08:09:27.8731252Z inflating: build/bin/basic 2025-09-07T08:09:27.8788031Z inflating: build/bin/apply_utils_test 2025-09-07T08:09:27.8840208Z inflating: build/bin/broadcast_test 2025-09-07T08:09:27.8889531Z inflating: build/bin/cpu_allocator_test 2025-09-07T08:09:27.8944734Z inflating: build/bin/cpu_generator_test 2025-09-07T08:09:27.8995286Z inflating: build/bin/cpu_profiling_allocator_test 2025-09-07T08:09:27.9082510Z inflating: build/bin/cpu_rng_test 2025-09-07T08:09:27.9131336Z inflating: build/bin/dlconvertor_test 2025-09-07T08:09:27.9186574Z inflating: build/bin/extension_backend_test 2025-09-07T08:09:27.9239003Z inflating: build/bin/half_test 2025-09-07T08:09:27.9286868Z inflating: build/bin/lazy_tensor_test 2025-09-07T08:09:27.9337870Z inflating: build/bin/memory_format_test 2025-09-07T08:09:27.9389224Z inflating: build/bin/math_kernel_test 2025-09-07T08:09:27.9480662Z inflating: build/bin/ivalue_test 2025-09-07T08:09:27.9531302Z inflating: build/bin/memory_overlapping_test 
2025-09-07T08:09:27.9581818Z inflating: build/bin/mobile_memory_cleanup 2025-09-07T08:09:27.9635690Z inflating: build/bin/native_test 2025-09-07T08:09:27.9685188Z inflating: build/bin/operator_name_test 2025-09-07T08:09:27.9734068Z inflating: build/bin/operators_test 2025-09-07T08:09:27.9783289Z inflating: build/bin/packedtensoraccessor_test 2025-09-07T08:09:27.9846367Z inflating: build/bin/pow_test 2025-09-07T08:09:27.9900677Z inflating: build/bin/quantized_test 2025-09-07T08:09:27.9948797Z inflating: build/bin/reduce_ops_test 2025-09-07T08:09:27.9997692Z inflating: build/bin/reportMemoryUsage_test 2025-09-07T08:09:28.0052018Z inflating: build/bin/scalar_tensor_test 2025-09-07T08:09:28.0108504Z inflating: build/bin/scalar_test 2025-09-07T08:09:28.0158199Z inflating: build/bin/StorageUtils_test 2025-09-07T08:09:28.0208686Z inflating: build/bin/stride_properties_test 2025-09-07T08:09:28.0284329Z inflating: build/bin/tensor_iterator_test 2025-09-07T08:09:28.0336278Z inflating: build/bin/test_parallel 2025-09-07T08:09:28.0384095Z inflating: build/bin/thread_init_test 2025-09-07T08:09:28.0436687Z inflating: build/bin/type_ptr_test 2025-09-07T08:09:28.0492650Z inflating: build/bin/type_test 2025-09-07T08:09:28.0542228Z inflating: build/bin/undefined_tensor_test 2025-09-07T08:09:28.0589490Z inflating: build/bin/verify_api_visibility 2025-09-07T08:09:28.0655583Z inflating: build/bin/legacy_vmap_test 2025-09-07T08:09:28.0703913Z inflating: build/bin/weakref_test 2025-09-07T08:09:28.0753223Z inflating: build/bin/xla_tensor_test 2025-09-07T08:09:28.0802255Z inflating: build/bin/wrapdim_test 2025-09-07T08:09:28.0858951Z inflating: build/bin/IListRef_test 2025-09-07T08:09:28.0972420Z inflating: build/bin/kernel_function_legacy_test 2025-09-07T08:09:28.1071614Z inflating: build/bin/List_test 2025-09-07T08:09:28.1135343Z inflating: build/bin/KernelFunction_test 2025-09-07T08:09:28.1225256Z inflating: build/bin/kernel_function_test 2025-09-07T08:09:28.1344014Z inflating: 
build/bin/kernel_lambda_legacy_test 2025-09-07T08:09:28.1440616Z inflating: build/bin/kernel_lambda_test 2025-09-07T08:09:28.1499187Z inflating: build/bin/kernel_stackbased_test 2025-09-07T08:09:28.1588699Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-09-07T08:09:28.1637538Z inflating: build/bin/CppSignature_test 2025-09-07T08:09:28.1690243Z inflating: build/bin/backend_fallback_test 2025-09-07T08:09:28.1736506Z inflating: build/bin/op_allowlist_test 2025-09-07T08:09:28.2013035Z inflating: build/bin/op_registration_test 2025-09-07T08:09:28.2075869Z inflating: build/bin/inline_container_test 2025-09-07T08:09:28.2125908Z inflating: build/bin/cuda_allocator_test 2025-09-07T08:09:28.2176058Z inflating: build/bin/cuda_apply_test 2025-09-07T08:09:28.2231992Z inflating: build/bin/cuda_atomic_ops_test 2025-09-07T08:09:28.2286030Z inflating: build/bin/cuda_caching_host_allocator_test 2025-09-07T08:09:28.2352001Z inflating: build/bin/cuda_complex_math_test 2025-09-07T08:09:28.2407937Z inflating: build/bin/cuda_complex_test 2025-09-07T08:09:28.2466638Z inflating: build/bin/cuda_cub_test 2025-09-07T08:09:28.2514152Z inflating: build/bin/cuda_device_test 2025-09-07T08:09:28.2576554Z inflating: build/bin/cuda_distributions_test 2025-09-07T08:09:28.2626788Z inflating: build/bin/cuda_dlconvertor_test 2025-09-07T08:09:28.2674679Z inflating: build/bin/cuda_exchange_device_test 2025-09-07T08:09:28.2722284Z inflating: build/bin/cuda_half_test 2025-09-07T08:09:28.2776125Z inflating: build/bin/cuda_generator_test 2025-09-07T08:09:28.2825323Z inflating: build/bin/cuda_integer_divider_test 2025-09-07T08:09:28.2872979Z inflating: build/bin/cuda_optional_test 2025-09-07T08:09:28.2922854Z inflating: build/bin/cuda_packedtensoraccessor_test 2025-09-07T08:09:28.2972291Z inflating: build/bin/cuda_reportMemoryUsage_test 2025-09-07T08:09:28.3019725Z inflating: build/bin/cuda_allocatorTraceTracker_test 2025-09-07T08:09:28.3077011Z inflating: build/bin/cuda_stream_test 
2025-09-07T08:09:28.3124471Z inflating: build/bin/cuda_cudnn_test 2025-09-07T08:09:28.3174567Z inflating: build/bin/cuda_vectorized_test 2025-09-07T08:09:28.3518677Z inflating: build/bin/test_nativert 2025-09-07T08:09:28.3571425Z inflating: build/bin/test_dist_autograd 2025-09-07T08:09:28.3635761Z inflating: build/bin/test_cpp_rpc 2025-09-07T08:09:28.4712145Z inflating: build/bin/test_api 2025-09-07T08:09:28.4714069Z inflating: build/bin/parallel_benchmark 2025-09-07T08:09:28.4776637Z inflating: build/bin/ProcessGroupGlooTest 2025-09-07T08:09:28.4837818Z inflating: build/bin/ProcessGroupNCCLTest 2025-09-07T08:09:28.4892565Z inflating: build/bin/ProcessGroupGlooAsyncTest 2025-09-07T08:09:28.4951053Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2025-09-07T08:09:28.5950919Z inflating: build/bin/test_jit 2025-09-07T08:09:28.6272637Z inflating: build/bin/test_lazy 2025-09-07T08:09:28.6276292Z inflating: build/bin/torch_shm_manager 2025-09-07T08:09:28.6276636Z creating: .additional_ci_files/ 2025-09-07T08:09:28.6356730Z inflating: .additional_ci_files/test-times.json 2025-09-07T08:09:28.6660967Z inflating: .additional_ci_files/test-class-times.json 2025-09-07T08:09:28.7154168Z ##[group]Run rm artifacts.zip 2025-09-07T08:09:28.7154449Z rm artifacts.zip 2025-09-07T08:09:28.7169223Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:28.7169517Z env: 2025-09-07T08:09:28.7169683Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:28.7169936Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:28.7170297Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:28.7170590Z ##[endgroup] 2025-09-07T08:09:29.3215762Z ##[group]Run df -H 2025-09-07T08:09:29.3215983Z df -H 2025-09-07T08:09:29.3230650Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:29.3230928Z env: 2025-09-07T08:09:29.3231093Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:29.3231570Z GPU_FLAG: --gpus all -e 
NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:29.3231905Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:29.3232179Z ##[endgroup] 2025-09-07T08:09:29.3653425Z Filesystem Size Used Avail Use% Mounted on 2025-09-07T08:09:29.3653709Z overlay 7.3T 633G 6.7T 9% / 2025-09-07T08:09:29.3653952Z tmpfs 68M 0 68M 0% /dev 2025-09-07T08:09:29.3654194Z shm 68M 0 68M 0% /dev/shm 2025-09-07T08:09:29.3654453Z /dev/root 7.3T 633G 6.7T 9% /home/david/_work 2025-09-07T08:09:29.3654759Z tmpfs 215G 111k 215G 1% /run/docker.sock 2025-09-07T08:09:29.3655213Z tmpfs 1.1T 13k 1.1T 1% /proc/driver/nvidia 2025-09-07T08:09:29.3655563Z tmpfs 430G 2.9M 430G 1% /run/.ro2533259278/nvidia-persistenced/socket 2025-09-07T08:09:29.3655908Z tmpfs 1.1T 0 1.1T 0% /proc/acpi 2025-09-07T08:09:29.3656161Z tmpfs 1.1T 0 1.1T 0% /proc/scsi 2025-09-07T08:09:29.3656662Z tmpfs 1.1T 0 1.1T 0% /sys/firmware 2025-09-07T08:09:29.3683134Z Prepare all required actions 2025-09-07T08:09:29.3684017Z Getting action download info 2025-09-07T08:09:29.5557657Z ##[group]Run ./.github/actions/download-td-artifacts 2025-09-07T08:09:29.5557977Z with: 2025-09-07T08:09:29.5558153Z env: 2025-09-07T08:09:29.5558337Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:29.5558626Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:29.5559034Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:29.5559370Z ##[endgroup] 2025-09-07T08:09:29.8343318Z ##[group]Run seemethere/download-artifact-s3@v4 2025-09-07T08:09:29.8343570Z with: 2025-09-07T08:09:29.8343728Z name: td_results 2025-09-07T08:09:29.8343914Z s3-bucket: gha-artifacts 2025-09-07T08:09:29.8344110Z region: us-east-1 2025-09-07T08:09:29.8344269Z env: 2025-09-07T08:09:29.8344431Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:29.8344680Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:29.8345324Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:29.8345631Z 
##[endgroup] 2025-09-07T08:09:30.2427147Z (node:9129) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-09-07T08:09:30.2427584Z 2025-09-07T08:09:30.2427765Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-09-07T08:09:30.2428238Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-09-07T08:09:30.2428756Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-09-07T08:09:30.3592986Z Found 0 objects with prefix pytorch/pytorch/17525296438/td_results/ 2025-09-07T08:09:30.3598969Z Artifact download has finished successfully 2025-09-07T08:09:30.5058500Z ##[group]Run mkdir -p .additional_ci_files 2025-09-07T08:09:30.5058809Z mkdir -p .additional_ci_files 2025-09-07T08:09:30.5059129Z mv td_results.json .additional_ci_files/td_results.json || true 2025-09-07T08:09:30.5073897Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:30.5074189Z env: 2025-09-07T08:09:30.5074346Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:30.5074612Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:30.5075102Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:30.5075385Z ##[endgroup] 2025-09-07T08:09:30.7385712Z mv: cannot stat 'td_results.json': No such file or directory 2025-09-07T08:09:30.9644188Z ##[group]Run .github/scripts/parse_ref.py 2025-09-07T08:09:30.9644515Z .github/scripts/parse_ref.py 2025-09-07T08:09:30.9659230Z shell: /usr/bin/bash -e {0} 2025-09-07T08:09:30.9659450Z env: 2025-09-07T08:09:30.9659614Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:30.9659870Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:30.9660206Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:30.9660488Z ##[endgroup] 2025-09-07T08:09:31.2073106Z Setting output branch=main 2025-09-07T08:09:31.2169837Z Prepare all required actions 2025-09-07T08:09:31.2170173Z Getting 
action download info 2025-09-07T08:09:31.4237647Z ##[group]Run ./.github/actions/filter-test-configs 2025-09-07T08:09:31.4238015Z with: 2025-09-07T08:09:31.4238481Z github-token: *** 2025-09-07T08:09:31.4244188Z test-matrix: {"include": [{"config": "inductor_huggingface_perf_cuda_h100", "shard": 1, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 2, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 3, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 4, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 5, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 1, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 2, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 3, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 4, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 5, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 6, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 7, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 1, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 2, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 3, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 4, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 5, "num_shards": 9, 
"runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 6, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 7, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 8, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 9, "num_shards": 9, "runner": "linux.aws.h100"}]} 2025-09-07T08:09:31.4249305Z job-name: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T08:09:31.4249630Z env: 2025-09-07T08:09:31.4249801Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:31.4250049Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:31.4250371Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:31.4250645Z ##[endgroup] 2025-09-07T08:09:31.4726781Z ##[group]Run nick-fields/retry@v3.0.0 2025-09-07T08:09:31.4727040Z with: 2025-09-07T08:09:31.4727195Z shell: bash 2025-09-07T08:09:31.4727370Z timeout_minutes: 10 2025-09-07T08:09:31.4727553Z max_attempts: 5 2025-09-07T08:09:31.4727735Z retry_wait_seconds: 30 2025-09-07T08:09:31.4728337Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-09-07T08:09:31.4728968Z polling_interval_seconds: 1 2025-09-07T08:09:31.4729184Z warning_on_retry: true 2025-09-07T08:09:31.4729392Z continue_on_error: false 2025-09-07T08:09:31.4729588Z env: 2025-09-07T08:09:31.4729740Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:31.4729984Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:31.4730332Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:31.4730772Z GITHUB_TOKEN: *** 2025-09-07T08:09:31.4730948Z ##[endgroup] 2025-09-07T08:09:31.5447769Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 
2025-09-07T08:09:31.8152674Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T08:09:32.4507112Z Collecting requests==2.27.1 2025-09-07T08:09:32.5077788Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-09-07T08:09:33.2195693Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.1/63.1 KB 74.6 kB/s eta 0:00:00 2025-09-07T08:09:34.0593771Z Collecting pyyaml==6.0.2 2025-09-07T08:09:34.0702548Z Downloading PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB) 2025-09-07T08:09:34.6492069Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 751.2/751.2 KB 1.3 MB/s eta 0:00:00 2025-09-07T08:09:34.6618548Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests==2.27.1) (3.3) 2025-09-07T08:09:35.4418800Z Collecting charset-normalizer~=2.0.0 2025-09-07T08:09:35.4524034Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-09-07T08:09:35.9534132Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3/dist-packages (from requests==2.27.1) (1.26.5) 2025-09-07T08:09:35.9539594Z Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests==2.27.1) (2020.6.20) 2025-09-07T08:09:36.0247930Z Installing collected packages: pyyaml, charset-normalizer, requests 2025-09-07T08:09:36.6482921Z WARNING: The script normalizer is installed in '/home/david/.local/bin' which is not on PATH. 2025-09-07T08:09:36.6483718Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T08:09:38.2670660Z Successfully installed charset-normalizer-2.0.12 pyyaml-6.0.2 requests-2.27.1 2025-09-07T08:09:38.5484615Z Command completed after 1 attempt(s). 
2025-09-07T08:09:38.5920032Z ##[group]Run set -x 2025-09-07T08:09:38.5920268Z set -x 2025-09-07T08:09:38.5920465Z 2025-09-07T08:09:38.5920806Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-09-07T08:09:38.5921235Z # in runner workspace 2025-09-07T08:09:38.5921583Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-09-07T08:09:38.5936087Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:38.5936388Z env: 2025-09-07T08:09:38.5936552Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:38.5936802Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:38.5937131Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:38.5937412Z ##[endgroup] 2025-09-07T08:09:38.7471294Z + python3 /home/david/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-09-07T08:09:38.7612340Z Setting output branch=main 2025-09-07T08:09:38.7833931Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-09-07T08:09:38.7834276Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-09-07T08:09:38.7834536Z echo "Job name: ${JOB_NAME}" 2025-09-07T08:09:38.7834789Z 2025-09-07T08:09:38.7835321Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-09-07T08:09:38.7835742Z # in runner workspace 2025-09-07T08:09:38.7836118Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-09-07T08:09:38.7836545Z --workflow "${GITHUB_WORKFLOW}" \ 2025-09-07T08:09:38.7836841Z --job-name "${JOB_NAME}" \ 2025-09-07T08:09:38.7843056Z --test-matrix "{"include": [{"config": "inductor_huggingface_perf_cuda_h100", "shard": 1, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 2, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 3, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100",
"shard": 4, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 5, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 1, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 2, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 3, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 4, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 5, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 6, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 7, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 1, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 2, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 3, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 4, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 5, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 6, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 7, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 8, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 9, "num_shards": 9, "runner": "linux.aws.h100"}]}" \ 2025-09-07T08:09:38.7848649Z  --selected-test-configs "" \ 2025-09-07T08:09:38.7848900Z  --pr-number "${PR_NUMBER}" \ 2025-09-07T08:09:38.7849140Z  --tag 
"${TAG}" \ 2025-09-07T08:09:38.7849358Z  --event-name "${EVENT_NAME}" \ 2025-09-07T08:09:38.7849592Z  --schedule "${SCHEDULE}" \ 2025-09-07T08:09:38.7849820Z  --branch "${HEAD_BRANCH}" 2025-09-07T08:09:38.7863191Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:38.7863484Z env: 2025-09-07T08:09:38.7863649Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:38.7863898Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:38.7864244Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:38.7864701Z GITHUB_TOKEN: *** 2025-09-07T08:09:38.7865154Z JOB_NAME: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T08:09:38.7865497Z PR_NUMBER: 2025-09-07T08:09:38.7865654Z TAG: 2025-09-07T08:09:38.7865808Z EVENT_NAME: schedule 2025-09-07T08:09:38.7865987Z SCHEDULE: 0 7 * * 0 2025-09-07T08:09:38.7866163Z HEAD_BRANCH: main 2025-09-07T08:09:38.7866333Z ##[endgroup] 2025-09-07T08:09:38.8202213Z Workflow: inductor-perf-nightly-h100 2025-09-07T08:09:38.8202652Z Job name: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T08:09:39.0382200Z Setting output keep-going=True 2025-09-07T08:09:39.0382574Z Setting output ci-verbose-test-logs=False 2025-09-07T08:09:39.0382913Z Setting output ci-test-showlocals=False 2025-09-07T08:09:39.0383223Z Setting output ci-no-test-timeout=False 2025-09-07T08:09:39.0383506Z Setting output ci-no-td=False 2025-09-07T08:09:39.0383782Z Setting output ci-td-distributed=False 2025-09-07T08:09:39.0384069Z Setting output is-unstable=False 2025-09-07T08:09:39.0384343Z Setting output reenabled-issues= 2025-09-07T08:09:39.0391041Z Setting output test-matrix={"include": [{"config": "inductor_huggingface_perf_cuda_h100", "shard": 1, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 2, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 3, 
"num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 4, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 5, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 1, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 2, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 3, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 4, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 5, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 6, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 7, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 1, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 2, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 3, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 4, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 5, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 6, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 7, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 8, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 9, "num_shards": 9, "runner": "linux.aws.h100"}]} 2025-09-07T08:09:39.0397212Z Setting output 
is-test-matrix-empty=False 2025-09-07T08:09:39.0759203Z ##[group]Run echo "Filtered matrix:" 2025-09-07T08:09:39.0759519Z echo "Filtered matrix:" 2025-09-07T08:09:39.0765736Z echo "{"include": [{"config": "inductor_huggingface_perf_cuda_h100", "shard": 1, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 2, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 3, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 4, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_huggingface_perf_cuda_h100", "shard": 5, "num_shards": 5, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 1, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 2, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 3, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 4, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 5, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 6, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_timm_perf_cuda_h100", "shard": 7, "num_shards": 7, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 1, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 2, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 3, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 4, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 5, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": 
"inductor_torchbench_perf_cuda_h100", "shard": 6, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 7, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 8, "num_shards": 9, "runner": "linux.aws.h100"}, {"config": "inductor_torchbench_perf_cuda_h100", "shard": 9, "num_shards": 9, "runner": "linux.aws.h100"}]}" 2025-09-07T08:09:39.0770764Z  2025-09-07T08:09:39.0770926Z echo 2025-09-07T08:09:39.0771131Z echo "Is the current job unstable? False" 2025-09-07T08:09:39.0771372Z  2025-09-07T08:09:39.0771534Z echo 2025-09-07T08:09:39.0771729Z echo "Is keep-going label set? True" 2025-09-07T08:09:39.0771968Z  2025-09-07T08:09:39.0772111Z echo 2025-09-07T08:09:39.0772281Z echo "Reenabled issues? " 2025-09-07T08:09:39.0786162Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:39.0786457Z env: 2025-09-07T08:09:39.0786614Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:39.0786867Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:39.0787202Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:39.0787495Z ##[endgroup] 2025-09-07T08:09:39.1471031Z Filtered matrix: 2025-09-07T08:09:39.1477643Z {include: [{config: inductor_huggingface_perf_cuda_h100, shard: 1, num_shards: 5, runner: linux.aws.h100}, {config: inductor_huggingface_perf_cuda_h100, shard: 2, num_shards: 5, runner: linux.aws.h100}, {config: inductor_huggingface_perf_cuda_h100, shard: 3, num_shards: 5, runner: linux.aws.h100}, {config: inductor_huggingface_perf_cuda_h100, shard: 4, num_shards: 5, runner: linux.aws.h100}, {config: inductor_huggingface_perf_cuda_h100, shard: 5, num_shards: 5, runner: linux.aws.h100}, {config: inductor_timm_perf_cuda_h100, shard: 1, num_shards: 7, runner: linux.aws.h100}, {config: inductor_timm_perf_cuda_h100, shard: 2, num_shards: 7, runner: linux.aws.h100}, {config: inductor_timm_perf_cuda_h100, shard: 3, num_shards: 
7, runner: linux.aws.h100}, {config: inductor_timm_perf_cuda_h100, shard: 4, num_shards: 7, runner: linux.aws.h100}, {config: inductor_timm_perf_cuda_h100, shard: 5, num_shards: 7, runner: linux.aws.h100}, {config: inductor_timm_perf_cuda_h100, shard: 6, num_shards: 7, runner: linux.aws.h100}, {config: inductor_timm_perf_cuda_h100, shard: 7, num_shards: 7, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 1, num_shards: 9, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 2, num_shards: 9, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 3, num_shards: 9, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 4, num_shards: 9, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 5, num_shards: 9, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 6, num_shards: 9, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 7, num_shards: 9, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 8, num_shards: 9, runner: linux.aws.h100}, {config: inductor_torchbench_perf_cuda_h100, shard: 9, num_shards: 9, runner: linux.aws.h100}]} 2025-09-07T08:09:39.1483683Z 2025-09-07T08:09:39.1483801Z Is the current job unstable? False 2025-09-07T08:09:39.1483987Z 2025-09-07T08:09:39.1484082Z Is keep-going label set? True 2025-09-07T08:09:39.1484247Z 2025-09-07T08:09:39.1484331Z Reenabled issues? 
2025-09-07T08:09:39.3740920Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-09-07T08:09:39.3741448Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-09-07T08:09:39.3756481Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:39.3756773Z env: 2025-09-07T08:09:39.3756938Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:39.3757199Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:39.3757540Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:39.3757813Z JOB_TIMEOUT: 1440 2025-09-07T08:09:39.3757988Z ##[endgroup] 2025-09-07T08:09:39.4537969Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T08:09:39.4538611Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T08:09:39.4538969Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-09-07T08:09:39.4552320Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T08:09:39.4552616Z env: 2025-09-07T08:09:39.4552782Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:39.4553039Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:39.4553384Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:39.4553664Z ##[endgroup] 2025-09-07T08:09:39.6727457Z ##[group]Run set -x 2025-09-07T08:09:39.6727801Z set -x 2025-09-07T08:09:39.6727996Z  2025-09-07T08:09:39.6728203Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-09-07T08:09:39.6728529Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-09-07T08:09:39.6728862Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-09-07T08:09:39.6729173Z  TEST_COMMAND=.ci/onnx/test.sh 2025-09-07T08:09:39.6729413Z else 2025-09-07T08:09:39.6729626Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-09-07T08:09:39.6729878Z fi 2025-09-07T08:09:39.6730044Z  2025-09-07T08:09:39.6730249Z # Leaving 1GB for the runner and other things 2025-09-07T08:09:39.6730725Z TOTAL_AVAILABLE_MEMORY_IN_GB=$(awk '/MemTotal/ { 
printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo) 2025-09-07T08:09:39.6731436Z # https://docs.docker.com/engine/containers/resource_constraints/#--memory-swap-details, the 3GB swap 2025-09-07T08:09:39.6732010Z # comes from https://github.com/pytorch/test-infra/pull/6058 2025-09-07T08:09:39.6732446Z TOTAL_MEMORY_WITH_SWAP=$(("${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}" + 3)) 2025-09-07T08:09:39.6732786Z  2025-09-07T08:09:39.6732997Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-09-07T08:09:39.6733276Z  SHM_OPTS= 2025-09-07T08:09:39.6733476Z  JENKINS_USER= 2025-09-07T08:09:39.6733766Z  # ensure that docker container cleanly exits in 12 hours 2025-09-07T08:09:39.6734155Z  # if for some reason cleanup action doesn't stop container 2025-09-07T08:09:39.6734487Z  # when job is cancelled 2025-09-07T08:09:39.6734742Z  DOCKER_SHELL_CMD="sleep 12h" 2025-09-07T08:09:39.6735135Z else 2025-09-07T08:09:39.6735341Z  SHM_OPTS="--shm-size=${SHM_SIZE}" 2025-09-07T08:09:39.6735592Z  JENKINS_USER="--user jenkins" 2025-09-07T08:09:39.6735822Z  DOCKER_SHELL_CMD= 2025-09-07T08:09:39.6736006Z fi 2025-09-07T08:09:39.6736158Z  2025-09-07T08:09:39.6736399Z # detached container should get cleaned up by teardown_ec2_linux 2025-09-07T08:09:39.6736779Z # TODO: Stop building test binaries as part of the build phase 2025-09-07T08:09:39.6737203Z # Used for GPU_FLAG, SHM_OPTS, JENKINS_USER and DOCKER_SHELL_CMD since that doesn't play nice 2025-09-07T08:09:39.6737593Z # shellcheck disable=SC2086,SC2090 2025-09-07T08:09:39.6737839Z container_name=$(docker run \ 2025-09-07T08:09:39.6738066Z  ${GPU_FLAG:-} \ 2025-09-07T08:09:39.6738291Z  ${SCCACHE_SERVER_PORT_DOCKER_FLAG:-} \ 2025-09-07T08:09:39.6738532Z  -e BUILD_ENVIRONMENT \ 2025-09-07T08:09:39.6738748Z  -e PR_NUMBER \ 2025-09-07T08:09:39.6738948Z  -e GITHUB_ACTIONS \ 2025-09-07T08:09:39.6739158Z  -e GITHUB_REPOSITORY \ 2025-09-07T08:09:39.6739368Z  -e GITHUB_WORKFLOW \ 2025-09-07T08:09:39.6739573Z  -e GITHUB_JOB \ 2025-09-07T08:09:39.6739774Z  -e 
GITHUB_RUN_ID \ 2025-09-07T08:09:39.6739977Z  -e GITHUB_RUN_NUMBER \ 2025-09-07T08:09:39.6740184Z  -e GITHUB_RUN_ATTEMPT \ 2025-09-07T08:09:39.6740396Z  -e JOB_ID \ 2025-09-07T08:09:39.6740590Z  -e JOB_NAME \ 2025-09-07T08:09:39.6741018Z  -e BASE_SHA \ 2025-09-07T08:09:39.6741196Z  -e BRANCH \ 2025-09-07T08:09:39.6741472Z  -e SHA1 \ 2025-09-07T08:09:39.6741660Z  -e AWS_DEFAULT_REGION \ 2025-09-07T08:09:39.6741875Z  -e IN_WHEEL_TEST \ 2025-09-07T08:09:39.6742068Z  -e SHARD_NUMBER \ 2025-09-07T08:09:39.6742269Z  -e TEST_CONFIG \ 2025-09-07T08:09:39.6742465Z  -e NUM_TEST_SHARDS \ 2025-09-07T08:09:39.6742682Z  -e REENABLED_ISSUES \ 2025-09-07T08:09:39.6742898Z  -e CONTINUE_THROUGH_ERROR \ 2025-09-07T08:09:39.6743324Z  -e VERBOSE_TEST_LOGS \ 2025-09-07T08:09:39.6743556Z  -e TEST_SHOWLOCALS \ 2025-09-07T08:09:39.6743773Z  -e NO_TEST_TIMEOUT \ 2025-09-07T08:09:39.6743969Z  -e NO_TD \ 2025-09-07T08:09:39.6744161Z  -e TD_DISTRIBUTED \ 2025-09-07T08:09:39.6744368Z  -e PR_LABELS \ 2025-09-07T08:09:39.6744588Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-09-07T08:09:39.6744831Z  -e SCCACHE_BUCKET \ 2025-09-07T08:09:39.6745180Z  -e SCCACHE_REGION \ 2025-09-07T08:09:39.6745392Z  -e XLA_CUDA \ 2025-09-07T08:09:39.6745617Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2025-09-07T08:09:39.6745898Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-09-07T08:09:39.6746169Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-09-07T08:09:39.6746442Z  -e SKIP_SCCACHE_INITIALIZATION=1 \ 2025-09-07T08:09:39.6746695Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-09-07T08:09:39.6746957Z  -e VLLM_TEST_HUGGING_FACE_TOKEN \ 2025-09-07T08:09:39.6747206Z  -e SCRIBE_GRAPHQL_ACCESS_TOKEN \ 2025-09-07T08:09:39.6747450Z  -e DASHBOARD_TAG \ 2025-09-07T08:09:39.6747668Z  -e ARTIFACTS_FILE_SUFFIX \ 2025-09-07T08:09:39.6747939Z  --memory="${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}g" \ 2025-09-07T08:09:39.6748248Z  --memory-swap="${TOTAL_MEMORY_WITH_SWAP}g" \ 2025-09-07T08:09:39.6748559Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 
2025-09-07T08:09:39.6748850Z  --security-opt seccomp=unconfined \ 2025-09-07T08:09:39.6749105Z  --cap-add=SYS_PTRACE \ 2025-09-07T08:09:39.6749328Z  --ipc=host \ 2025-09-07T08:09:39.6749515Z  ${SHM_OPTS} \ 2025-09-07T08:09:39.6749712Z  --tty \ 2025-09-07T08:09:39.6749892Z  --detach \ 2025-09-07T08:09:39.6750093Z  --name="${container_name}" \ 2025-09-07T08:09:39.6750324Z  ${JENKINS_USER} \ 2025-09-07T08:09:39.6750595Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-09-07T08:09:39.6750890Z  -w /var/lib/jenkins/workspace \ 2025-09-07T08:09:39.6751131Z  "${DOCKER_IMAGE}" \ 2025-09-07T08:09:39.6751340Z  ${DOCKER_SHELL_CMD} 2025-09-07T08:09:39.6751538Z ) 2025-09-07T08:09:39.6751762Z # Propagate download.pytorch.org IP to container 2025-09-07T08:09:39.6752255Z grep download.pytorch.org /etc/hosts | docker exec -i "${container_name}" sudo bash -c "/bin/cat >> /etc/hosts" 2025-09-07T08:09:39.6752761Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2025-09-07T08:09:39.6753050Z  2025-09-07T08:09:39.6753244Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-09-07T08:09:39.6753657Z  docker exec -t "${container_name}" sh -c "python3 -m pip install -r .ci/docker/requirements-ci.txt" 2025-09-07T08:09:39.6754027Z fi 2025-09-07T08:09:39.6754175Z  2025-09-07T08:09:39.6754527Z docker exec -t "${container_name}" sh -c "python3 -m pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2025-09-07T08:09:39.6768867Z shell: /usr/bin/bash -e {0} 2025-09-07T08:09:39.6769082Z env: 2025-09-07T08:09:39.6769247Z GIT_DEFAULT_BRANCH: main 2025-09-07T08:09:39.6769494Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:09:39.6770013Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T08:09:39.6770368Z BUILD_ENVIRONMENT: linux-jammy-cuda12.8-py3.10-gcc9-sm90 2025-09-07T08:09:39.6770650Z PR_NUMBER: 2025-09-07T08:09:39.6770827Z GITHUB_REPOSITORY: pytorch/pytorch 2025-09-07T08:09:39.6771087Z GITHUB_WORKFLOW: 
inductor-perf-nightly-h100 2025-09-07T08:09:39.6771339Z GITHUB_JOB: test 2025-09-07T08:09:39.6771525Z GITHUB_RUN_ID: 17525296438 2025-09-07T08:09:39.6771720Z GITHUB_RUN_NUMBER: 662 2025-09-07T08:09:39.6771912Z GITHUB_RUN_ATTEMPT: 1 2025-09-07T08:09:39.6772095Z JOB_ID: 49775781836 2025-09-07T08:09:39.6772539Z JOB_NAME: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T08:09:39.6772865Z BRANCH: main 2025-09-07T08:09:39.6773064Z SHA1: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:09:39.6773338Z BASE_SHA: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:09:39.6773605Z TEST_CONFIG: inductor_timm_perf_cuda_h100 2025-09-07T08:09:39.6773835Z SHARD_NUMBER: 7 2025-09-07T08:09:39.6774016Z NUM_TEST_SHARDS: 7 2025-09-07T08:09:39.6774193Z REENABLED_ISSUES: 2025-09-07T08:09:39.6774378Z CONTINUE_THROUGH_ERROR: True 2025-09-07T08:09:39.6774586Z VERBOSE_TEST_LOGS: False 2025-09-07T08:09:39.6774775Z TEST_SHOWLOCALS: False 2025-09-07T08:09:39.6775108Z NO_TEST_TIMEOUT: False 2025-09-07T08:09:39.6775296Z NO_TD: False 2025-09-07T08:09:39.6775466Z TD_DISTRIBUTED: False 2025-09-07T08:09:39.6775684Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2025-09-07T08:09:39.6775948Z SCCACHE_REGION: us-east-1 2025-09-07T08:09:39.6776150Z SHM_SIZE: 2g 2025-09-07T08:09:39.6776799Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T08:09:39.6777461Z XLA_CUDA: 2025-09-07T08:09:39.6777728Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2025-09-07T08:09:39.6778054Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2025-09-07T08:09:39.6778292Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-09-07T08:09:39.6779182Z DASHBOARD_TAG: 
training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true 2025-09-07T08:09:39.6780244Z VLLM_TEST_HUGGING_FACE_TOKEN: *** 2025-09-07T08:09:39.6780541Z HUGGING_FACE_HUB_TOKEN: *** 2025-09-07T08:09:39.6780841Z SCRIBE_GRAPHQL_ACCESS_TOKEN: *** 2025-09-07T08:09:39.6781195Z ARTIFACTS_FILE_SUFFIX: test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836 2025-09-07T08:09:39.6781642Z ##[endgroup] 2025-09-07T08:09:39.7199410Z + [[ inductor_timm_perf_cuda_h100 == \m\u\l\t\i\g\p\u ]] 2025-09-07T08:09:39.7199776Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *onnx* ]] 2025-09-07T08:09:39.7200088Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-09-07T08:09:39.7203167Z ++ awk '/MemTotal/ { printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo 2025-09-07T08:09:39.7214545Z + TOTAL_AVAILABLE_MEMORY_IN_GB='1998.949 ' 2025-09-07T08:09:39.7214843Z + TOTAL_MEMORY_WITH_SWAP=2001 2025-09-07T08:09:39.7215315Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *\s\3\9\0\x* ]] 2025-09-07T08:09:39.7215646Z + SHM_OPTS=--shm-size=2g 2025-09-07T08:09:39.7215867Z + JENKINS_USER='--user jenkins' 2025-09-07T08:09:39.7216087Z + DOCKER_SHELL_CMD= 2025-09-07T08:09:39.7223725Z +++ nproc --ignore=2 2025-09-07T08:09:39.7238144Z ++ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -e SCCACHE_SERVER_PORT=5230 -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e TD_DISTRIBUTED -e PR_LABELS -e MAX_JOBS=22 -e SCCACHE_BUCKET -e SCCACHE_REGION -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e 
PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e SKIP_SCCACHE_INITIALIZATION=1 -e HUGGING_FACE_HUB_TOKEN -e VLLM_TEST_HUGGING_FACE_TOKEN -e SCRIBE_GRAPHQL_ACCESS_TOKEN -e DASHBOARD_TAG -e ARTIFACTS_FILE_SUFFIX --memory=1998g --memory-swap=2001g --env-file=/tmp/github_env_17525296438 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/david/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks-ae53c6842aa4c2407d0ad976491ca941c2635c77 2025-09-07T08:15:14.0068446Z + container_name=146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T08:15:14.0073043Z + grep download.pytorch.org /etc/hosts 2025-09-07T08:15:14.0075455Z + docker exec -i 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f sudo bash -c '/bin/cat >> /etc/hosts' 2025-09-07T08:15:14.0604702Z + echo DOCKER_CONTAINER_ID=146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T08:15:14.0607389Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *\s\3\9\0\x* ]] 2025-09-07T08:15:14.0610604Z ++ echo dist/torch-2.9.0a0+git93fb23d-cp310-cp310-linux_x86_64.whl 2025-09-07T08:15:14.0613895Z + docker exec -t 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f sh -c 'python3 -m pip install dist/torch-2.9.0a0+git93fb23d-cp310-cp310-linux_x86_64.whl[opt-einsum] && .ci/pytorch/test.sh' 2025-09-07T08:15:14.4579009Z Processing ./dist/torch-2.9.0a0+git93fb23d-cp310-cp310-linux_x86_64.whl (from torch==2.9.0a0+git93fb23d) 2025-09-07T08:15:14.7563350Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (3.19.1) 2025-09-07T08:15:14.7566439Z Requirement already satisfied: typing-extensions>=4.10.0 in 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (4.15.0) 2025-09-07T08:15:14.7570027Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (1.13.3) 2025-09-07T08:15:14.7573829Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (2.8.8) 2025-09-07T08:15:14.7576977Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (3.1.6) 2025-09-07T08:15:14.7580773Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (2025.3.0) 2025-09-07T08:15:14.7593209Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (3.3.0) 2025-09-07T08:15:14.7926182Z Requirement already satisfied: numpy>=1.7 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (1.22.4) 2025-09-07T08:15:14.7943683Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (1.3.0) 2025-09-07T08:15:14.7976336Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.9.0a0+git93fb23d->torch==2.9.0a0+git93fb23d) (3.0.2) 2025-09-07T08:15:15.6353028Z Installing collected packages: torch 2025-09-07T08:15:24.9417104Z ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. 
This behaviour is the source of the following dependency conflicts. 2025-09-07T08:15:24.9418473Z dall-e 0.1 requires torchvision, which is not installed. 2025-09-07T08:15:24.9418895Z effdet 0.4.1 requires torchvision, which is not installed. 2025-09-07T08:15:24.9419346Z python-doctr 1.0.0 requires torchvision>=0.15.0, which is not installed. 2025-09-07T08:15:24.9419890Z pytorch-labs-segment-anything-fast 0.2 requires torchao, which is not installed. 2025-09-07T08:15:24.9420552Z pytorch-labs-segment-anything-fast 0.2 requires torchvision>=0.17.0.dev20231026, which is not installed. 2025-09-07T08:15:24.9421231Z timm 1.0.14 requires torchvision, which is not installed. 2025-09-07T08:15:24.9422086Z Successfully installed torch-2.9.0a0+git93fb23d 2025-09-07T08:15:25.0102547Z + export TERM=vt100 2025-09-07T08:15:25.0102828Z + TERM=vt100 2025-09-07T08:15:25.0105460Z ++ dirname .ci/pytorch/test.sh 2025-09-07T08:15:25.0116641Z + source .ci/pytorch/common.sh 2025-09-07T08:15:25.0120832Z +++ dirname .ci/pytorch/common.sh 2025-09-07T08:15:25.0130235Z ++ source .ci/pytorch/common_utils.sh 2025-09-07T08:15:25.0130808Z +++ declare -f -t trap_add 2025-09-07T08:15:25.0134479Z ++ set -ex -o pipefail 2025-09-07T08:15:25.0134754Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *rocm* ]] 2025-09-07T08:15:25.0135234Z ++ BUILD_TEST_LIBTORCH=0 2025-09-07T08:15:25.0139246Z ++ dirname .ci/pytorch/test.sh 2025-09-07T08:15:25.0148307Z + source .ci/pytorch/common-build.sh 2025-09-07T08:15:25.0148994Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 != *win-* ]] 2025-09-07T08:15:25.0157018Z ++++ dirname .ci/pytorch/common-build.sh 2025-09-07T08:15:25.0164306Z +++ cd .ci/pytorch 2025-09-07T08:15:25.0164697Z +++ pwd -P 2025-09-07T08:15:25.0167157Z ++ script_dir=/var/lib/jenkins/workspace/.ci/pytorch 2025-09-07T08:15:25.0167542Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *-pch* ]] 2025-09-07T08:15:25.0167840Z ++ which sccache 2025-09-07T08:15:25.0183123Z ++ [[ -z ossci-compiler-cache-circleci-v2 ]] 
2025-09-07T08:15:25.0183401Z ++ sccache --stop-server 2025-09-07T08:15:25.0210712Z ++ true 2025-09-07T08:15:25.0210935Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-09-07T08:15:25.0223562Z ++ trap_add sccache_epilogue EXIT 2025-09-07T08:15:25.0223804Z ++ trap_add_cmd=sccache_epilogue 2025-09-07T08:15:25.0224011Z ++ shift 2025-09-07T08:15:25.0224178Z ++ for trap_add_name in "$@" 2025-09-07T08:15:25.0230088Z ++++ trap -p EXIT 2025-09-07T08:15:25.0232203Z +++ eval 'extract_trap_cmd ' 2025-09-07T08:15:25.0232405Z ++++ extract_trap_cmd 2025-09-07T08:15:25.0232582Z ++++ printf '%s\n' '' 2025-09-07T08:15:25.0232787Z +++ printf '%s\n' sccache_epilogue 2025-09-07T08:15:25.0234373Z ++ trap -- ' 2025-09-07T08:15:25.0234568Z sccache_epilogue' EXIT 2025-09-07T08:15:25.0234787Z ++ [[ -n 1 ]] 2025-09-07T08:15:25.0235237Z ++ echo 'Skipping sccache server initialization, setting environment variables' 2025-09-07T08:15:25.0235765Z Skipping sccache server initialization, setting environment variables 2025-09-07T08:15:25.0236156Z ++ export SCCACHE_IDLE_TIMEOUT=0 2025-09-07T08:15:25.0236408Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-09-07T08:15:25.0236707Z ++ export SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-09-07T08:15:25.0237096Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-09-07T08:15:25.0237447Z ++ export RUST_LOG=sccache::server=error 2025-09-07T08:15:25.0237722Z ++ RUST_LOG=sccache::server=error 2025-09-07T08:15:25.0237973Z ++ sccache --zero-stats 2025-09-07T08:15:25.1941265Z Statistics zeroed. 
2025-09-07T08:15:25.1949226Z ++ which ccache 2025-09-07T08:15:25.1967664Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 != *rocm* ]] 2025-09-07T08:15:25.1968595Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 != *s390x* ]] 2025-09-07T08:15:25.1969166Z + [[ -d /var/lib/jenkins/workspace ]] 2025-09-07T08:15:25.1972401Z ++ stat -c %u /var/lib/jenkins/workspace 2025-09-07T08:15:25.1987877Z + WORKSPACE_ORIGINAL_OWNER_ID=1000 2025-09-07T08:15:25.1988241Z + trap_add cleanup_workspace EXIT 2025-09-07T08:15:25.1988596Z + trap_add_cmd=cleanup_workspace 2025-09-07T08:15:25.1988921Z + shift 2025-09-07T08:15:25.1989167Z + for trap_add_name in "$@" 2025-09-07T08:15:25.1996662Z +++ trap -p EXIT 2025-09-07T08:15:25.1998627Z ++ eval 'extract_trap_cmd trap -- '\'' 2025-09-07T08:15:25.1999012Z sccache_epilogue'\'' EXIT' 2025-09-07T08:15:25.1999327Z +++ extract_trap_cmd trap -- ' 2025-09-07T08:15:25.1999656Z sccache_epilogue' EXIT 2025-09-07T08:15:25.1999939Z +++ printf '%s\n' ' 2025-09-07T08:15:25.2000215Z sccache_epilogue' 2025-09-07T08:15:25.2000508Z ++ printf '%s\n' cleanup_workspace 2025-09-07T08:15:25.2000934Z + trap -- ' 2025-09-07T08:15:25.2001191Z sccache_epilogue 2025-09-07T08:15:25.2001471Z cleanup_workspace' EXIT 2025-09-07T08:15:25.2002112Z + sudo chown -R jenkins /var/lib/jenkins/workspace 2025-09-07T08:15:28.6945861Z + git config --global --add safe.directory /var/lib/jenkins/workspace 2025-09-07T08:15:28.6970230Z + echo 'Environment variables:' 2025-09-07T08:15:28.6970517Z Environment variables: 2025-09-07T08:15:28.6970748Z + env 2025-09-07T08:15:28.6980555Z GITHUB_WORKSPACE=/home/david/_work/pytorch/pytorch 2025-09-07T08:15:28.6980973Z CONTINUE_THROUGH_ERROR=True 2025-09-07T08:15:28.6981331Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3.10-gcc9-sm90 2025-09-07T08:15:28.6982169Z VLLM_TEST_HUGGING_FACE_TOKEN=*** 2025-09-07T08:15:28.6982436Z HOSTNAME=146d7de0d633 2025-09-07T08:15:28.6982886Z 
GITHUB_PATH=/home/david/_work/_temp/_runner_file_commands/add_path_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.6983417Z GITHUB_ACTION=__run_2 2025-09-07T08:15:28.6983653Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-09-07T08:15:28.6983888Z GITHUB_RUN_NUMBER=662 2025-09-07T08:15:28.6984099Z TEST_CONFIG=inductor_timm_perf_cuda_h100 2025-09-07T08:15:28.6984388Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-09-07T08:15:28.6984643Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-09-07T08:15:28.6984896Z SCCACHE_IDLE_TIMEOUT=0 2025-09-07T08:15:28.6985395Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-09-07T08:15:28.6985646Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-09-07T08:15:28.6985894Z GITHUB_REF_TYPE=branch 2025-09-07T08:15:28.6986130Z BASE_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:15:28.6986400Z XLA_CUDA= 2025-09-07T08:15:28.6986580Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 2025-09-07T08:15:28.6986894Z HUGGING_FACE_HUB_TOKEN=*** 2025-09-07T08:15:28.6987258Z *** 2025-09-07T08:15:28.6987431Z GITHUB_REPOSITORY_ID=65600975 2025-09-07T08:15:28.6987658Z GITHUB_ACTIONS=true 2025-09-07T08:15:28.6987858Z NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:15:28.6988119Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-09-07T08:15:28.6988431Z SHA1=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:15:28.6988732Z GITHUB_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:15:28.6989259Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/inductor-perf-test-nightly-h100.yml@refs/heads/main 2025-09-07T08:15:28.6989727Z UCC_HOME=/usr 2025-09-07T08:15:28.6989906Z VERBOSE_TEST_LOGS=False 2025-09-07T08:15:28.6990115Z GITHUB_REF=refs/heads/main 2025-09-07T08:15:28.6990322Z SHARD_NUMBER=7 2025-09-07T08:15:28.6990501Z GITHUB_REF_PROTECTED=true 2025-09-07T08:15:28.6990712Z HOME=/var/lib/jenkins 2025-09-07T08:15:28.6990911Z SCCACHE_SERVER_PORT=5230 2025-09-07T08:15:28.6991142Z GITHUB_API_URL=https://api.github.com 2025-09-07T08:15:28.6991399Z 
PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-09-07T08:15:28.6991669Z UCX_COMMIT=7836b165abdbe468a2f607e7254011c07d788152 2025-09-07T08:15:28.6991941Z USE_SYSTEM_NCCL=1 2025-09-07T08:15:28.6992128Z NUM_TEST_SHARDS=7 2025-09-07T08:15:28.6992298Z UCX_HOME=/usr 2025-09-07T08:15:28.6992686Z GITHUB_STATE=/home/david/_work/_temp/_runner_file_commands/save_state_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.6993263Z JOB_NAME=test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T08:15:28.6993808Z GITHUB_ENV=/home/david/_work/_temp/_runner_file_commands/set_env_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.6994338Z GITHUB_EVENT_PATH=/home/david/_work/_temp/_github_workflow/event.json 2025-09-07T08:15:28.6994684Z GITHUB_EVENT_NAME=schedule 2025-09-07T08:15:28.6995743Z DASHBOARD_TAG=training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true 2025-09-07T08:15:28.6997136Z GITHUB_RUN_ID=17525296438 2025-09-07T08:15:28.6997325Z INSTALLED_OPENBLAS= 2025-09-07T08:15:28.6997706Z GITHUB_STEP_SUMMARY=/home/david/_work/_temp/_runner_file_commands/step_summary_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.6998143Z GITHUB_ACTOR=pytorchmergebot 2025-09-07T08:15:28.6998339Z PR_NUMBER= 2025-09-07T08:15:28.6998496Z DESIRED_CUDA=12.8.1 2025-09-07T08:15:28.6998664Z GITHUB_RUN_ATTEMPT=1 2025-09-07T08:15:28.6999065Z ANACONDA_PYTHON_VERSION=3.10 2025-09-07T08:15:28.6999315Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-09-07T08:15:28.6999561Z TERM=vt100 2025-09-07T08:15:28.6999709Z INSTALLED_VISION=yes 2025-09-07T08:15:28.6999891Z BRANCH=main 2025-09-07T08:15:28.7000054Z SCCACHE_REGION=us-east-1 2025-09-07T08:15:28.7000252Z OPENSSL_ROOT_DIR=/opt/openssl 2025-09-07T08:15:28.7000452Z CUDA_PATH=/usr/local/cuda 2025-09-07T08:15:28.7000768Z 
GITHUB_ACTION_PATH=/home/david/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-09-07T08:15:28.7001124Z GITHUB_SERVER_URL=https://github.com 2025-09-07T08:15:28.7001379Z UCC_COMMIT=430e241bf5d38cbc73fc7a6b89155397232e3f96 2025-09-07T08:15:28.7001614Z REENABLED_ISSUES= 2025-09-07T08:15:28.7001773Z DOCS= 2025-09-07T08:15:28.7001914Z SHLVL=1 2025-09-07T08:15:28.7002074Z MAX_JOBS=22 2025-09-07T08:15:28.7002227Z GITHUB_ACTOR_ID=97764156 2025-09-07T08:15:28.7002475Z GITHUB_WORKFLOW_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:15:28.7002752Z GITHUB_REF_NAME=main 2025-09-07T08:15:28.7003025Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-09-07T08:15:28.7003321Z GITHUB_JOB=test 2025-09-07T08:15:28.7003488Z NO_TEST_TIMEOUT=False 2025-09-07T08:15:28.7003666Z TD_DISTRIBUTED=False 2025-09-07T08:15:28.7003856Z GITHUB_REPOSITORY=pytorch/pytorch 2025-09-07T08:15:28.7004065Z GITHUB_RETENTION_DAYS=90 2025-09-07T08:15:28.7004260Z OPENSSL_DIR=/opt/openssl 2025-09-07T08:15:28.7004470Z GITHUB_ACTION_REPOSITORY= 2025-09-07T08:15:28.7005167Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-09-07T08:15:28.7005737Z GITHUB_BASE_REF= 2025-09-07T08:15:28.7005907Z INSTALLED_ACL= 2025-09-07T08:15:28.7006209Z ARTIFACTS_FILE_SUFFIX=test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836 2025-09-07T08:15:28.7006550Z CI=true 2025-09-07T08:15:28.7006711Z GITHUB_REPOSITORY_OWNER=pytorch 2025-09-07T08:15:28.7006962Z RUST_LOG=sccache::server=error 2025-09-07T08:15:28.7007161Z JOB_ID=49775781836 2025-09-07T08:15:28.7007326Z GITHUB_HEAD_REF= 2025-09-07T08:15:28.7007483Z GITHUB_ACTION_REF= 2025-09-07T08:15:28.7007703Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-09-07T08:15:28.7007973Z TEST_SHOWLOCALS=False 2025-09-07T08:15:28.7008188Z GITHUB_WORKFLOW=inductor-perf-nightly-h100 2025-09-07T08:15:28.7008437Z 
DEBIAN_FRONTEND=noninteractive 2025-09-07T08:15:28.7008831Z GITHUB_OUTPUT=/home/david/_work/_temp/_runner_file_commands/set_output_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.7009227Z NO_TD=False 2025-09-07T08:15:28.7009396Z SKIP_SCCACHE_INITIALIZATION=1 2025-09-07T08:15:28.7009613Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-09-07T08:15:28.7009832Z _=/usr/bin/env 2025-09-07T08:15:28.7010065Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-09-07T08:15:28.7258797Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-09-07T08:15:28.7259356Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-09-07T08:15:28.7259866Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-09-07T08:15:28.7260373Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-09-07T08:15:28.7260769Z + BUILD_DIR=build 2025-09-07T08:15:28.7260994Z + BUILD_RENAMED_DIR=build_renamed 2025-09-07T08:15:28.7261619Z + BUILD_BIN_DIR=build/bin 2025-09-07T08:15:28.7261854Z + SHARD_NUMBER=7 2025-09-07T08:15:28.7262057Z + NUM_TEST_SHARDS=7 2025-09-07T08:15:28.7262293Z + export TORCH_SERIALIZATION_DEBUG=1 2025-09-07T08:15:28.7262596Z + TORCH_SERIALIZATION_DEBUG=1 2025-09-07T08:15:28.7262846Z + export VALGRIND=ON 2025-09-07T08:15:28.7263057Z + VALGRIND=ON 2025-09-07T08:15:28.7263342Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *clang9* ]] 2025-09-07T08:15:28.7263749Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *xpu* ]] 2025-09-07T08:15:28.7264033Z + detect_cuda_arch 2025-09-07T08:15:28.7264410Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *cuda* ]] 2025-09-07T08:15:28.7264704Z + command -v nvidia-smi 2025-09-07T08:15:28.7264910Z /usr/bin/nvidia-smi 2025-09-07T08:15:28.7270808Z ++ nvidia-smi --query-gpu=compute_cap --format=csv 2025-09-07T08:15:28.7272356Z ++ tail -n 1 2025-09-07T08:15:28.7487679Z + TORCH_CUDA_ARCH_LIST=9.0 2025-09-07T08:15:28.7488087Z + 
export TORCH_CUDA_ARCH_LIST 2025-09-07T08:15:28.7488575Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *s390x* ]] 2025-09-07T08:15:28.7489040Z + [[ 0 == \1 ]] 2025-09-07T08:15:28.7489329Z + [[ True == \1 ]] 2025-09-07T08:15:28.7489718Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 != *bazel* ]] 2025-09-07T08:15:28.7492715Z ++ realpath build/custom_test_artifacts 2025-09-07T08:15:28.7503424Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2025-09-07T08:15:28.7504111Z + [[ -n '' ]] 2025-09-07T08:15:28.7504429Z + echo 'Environment variables' 2025-09-07T08:15:28.7504807Z Environment variables 2025-09-07T08:15:28.7505478Z + env 2025-09-07T08:15:28.7511345Z GITHUB_WORKSPACE=/home/david/_work/pytorch/pytorch 2025-09-07T08:15:28.7511867Z CONTINUE_THROUGH_ERROR=True 2025-09-07T08:15:28.7512338Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3.10-gcc9-sm90 2025-09-07T08:15:28.7513078Z VLLM_TEST_HUGGING_FACE_TOKEN=*** 2025-09-07T08:15:28.7513467Z HOSTNAME=146d7de0d633 2025-09-07T08:15:28.7531244Z GITHUB_PATH=/home/david/_work/_temp/_runner_file_commands/add_path_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.7532004Z GITHUB_ACTION=__run_2 2025-09-07T08:15:28.7532365Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-09-07T08:15:28.7532755Z GITHUB_RUN_NUMBER=662 2025-09-07T08:15:28.7533118Z TEST_CONFIG=inductor_timm_perf_cuda_h100 2025-09-07T08:15:28.7533559Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-09-07T08:15:28.7533990Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-09-07T08:15:28.7534411Z SCCACHE_IDLE_TIMEOUT=0 2025-09-07T08:15:28.7535163Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-09-07T08:15:28.7535606Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-09-07T08:15:28.7536063Z GITHUB_REF_TYPE=branch 2025-09-07T08:15:28.7536398Z TORCH_CUDA_ARCH_LIST=9.0 2025-09-07T08:15:28.7536784Z BASE_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:15:28.7537218Z XLA_CUDA= 2025-09-07T08:15:28.7537510Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 
2025-09-07T08:15:28.7538063Z HUGGING_FACE_HUB_TOKEN=*** 2025-09-07T08:15:28.7538602Z *** 2025-09-07T08:15:28.7538886Z GITHUB_REPOSITORY_ID=65600975 2025-09-07T08:15:28.7539255Z GITHUB_ACTIONS=true 2025-09-07T08:15:28.7539581Z NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T08:15:28.7540017Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-09-07T08:15:28.7540515Z SHA1=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:15:28.7541006Z GITHUB_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:15:28.7541937Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/inductor-perf-test-nightly-h100.yml@refs/heads/main 2025-09-07T08:15:28.7542728Z UCC_HOME=/usr 2025-09-07T08:15:28.7543030Z TORCH_SERIALIZATION_DEBUG=1 2025-09-07T08:15:28.7543444Z VERBOSE_TEST_LOGS=False 2025-09-07T08:15:28.7543793Z GITHUB_REF=refs/heads/main 2025-09-07T08:15:28.7544131Z SHARD_NUMBER=7 2025-09-07T08:15:28.7544430Z GITHUB_REF_PROTECTED=true 2025-09-07T08:15:28.7544760Z HOME=/var/lib/jenkins 2025-09-07T08:15:28.7545305Z SCCACHE_SERVER_PORT=5230 2025-09-07T08:15:28.7545695Z GITHUB_API_URL=https://api.github.com 2025-09-07T08:15:28.7546588Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-09-07T08:15:28.7547039Z UCX_COMMIT=7836b165abdbe468a2f607e7254011c07d788152 2025-09-07T08:15:28.7547486Z USE_SYSTEM_NCCL=1 2025-09-07T08:15:28.7547792Z NUM_TEST_SHARDS=7 2025-09-07T08:15:28.7548083Z UCX_HOME=/usr 2025-09-07T08:15:28.7548705Z GITHUB_STATE=/home/david/_work/_temp/_runner_file_commands/save_state_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.7549653Z JOB_NAME=test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T08:15:28.7550870Z GITHUB_ENV=/home/david/_work/_temp/_runner_file_commands/set_env_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.7551770Z GITHUB_EVENT_PATH=/home/david/_work/_temp/_github_workflow/event.json 2025-09-07T08:15:28.7552320Z GITHUB_EVENT_NAME=schedule 2025-09-07T08:15:28.7553943Z 
DASHBOARD_TAG=training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true 2025-09-07T08:15:28.7555839Z GITHUB_RUN_ID=17525296438 2025-09-07T08:15:28.7556187Z INSTALLED_OPENBLAS= 2025-09-07T08:15:28.7556904Z GITHUB_STEP_SUMMARY=/home/david/_work/_temp/_runner_file_commands/step_summary_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.7557711Z GITHUB_ACTOR=pytorchmergebot 2025-09-07T08:15:28.7558069Z PR_NUMBER= 2025-09-07T08:15:28.7558349Z DESIRED_CUDA=12.8.1 2025-09-07T08:15:28.7558657Z GITHUB_RUN_ATTEMPT=1 2025-09-07T08:15:28.7558952Z VALGRIND=ON 2025-09-07T08:15:28.7559244Z ANACONDA_PYTHON_VERSION=3.10 2025-09-07T08:15:28.7559703Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-09-07T08:15:28.7560155Z TERM=vt100 2025-09-07T08:15:28.7560422Z INSTALLED_VISION=yes 2025-09-07T08:15:28.7560732Z BRANCH=main 2025-09-07T08:15:28.7561029Z SCCACHE_REGION=us-east-1 2025-09-07T08:15:28.7561388Z OPENSSL_ROOT_DIR=/opt/openssl 2025-09-07T08:15:28.7561748Z CUDA_PATH=/usr/local/cuda 2025-09-07T08:15:28.7562321Z GITHUB_ACTION_PATH=/home/david/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-09-07T08:15:28.7562982Z GITHUB_SERVER_URL=https://github.com 2025-09-07T08:15:28.7563457Z UCC_COMMIT=430e241bf5d38cbc73fc7a6b89155397232e3f96 2025-09-07T08:15:28.7563891Z REENABLED_ISSUES= 2025-09-07T08:15:28.7564180Z DOCS= 2025-09-07T08:15:28.7564432Z SHLVL=1 2025-09-07T08:15:28.7564681Z MAX_JOBS=22 2025-09-07T08:15:28.7565158Z GITHUB_ACTOR_ID=97764156 2025-09-07T08:15:28.7565629Z GITHUB_WORKFLOW_SHA=93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T08:15:28.7566129Z GITHUB_REF_NAME=main 2025-09-07T08:15:28.7566616Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-09-07T08:15:28.7567188Z GITHUB_JOB=test 2025-09-07T08:15:28.7567490Z NO_TEST_TIMEOUT=False 2025-09-07T08:15:28.7567809Z 
TD_DISTRIBUTED=False 2025-09-07T08:15:28.7568137Z GITHUB_REPOSITORY=pytorch/pytorch 2025-09-07T08:15:28.7568528Z GITHUB_RETENTION_DAYS=90 2025-09-07T08:15:28.7568870Z OPENSSL_DIR=/opt/openssl 2025-09-07T08:15:28.7569217Z GITHUB_ACTION_REPOSITORY= 2025-09-07T08:15:28.7570219Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-09-07T08:15:28.7571272Z GITHUB_BASE_REF= 2025-09-07T08:15:28.7571574Z INSTALLED_ACL= 2025-09-07T08:15:28.7572110Z ARTIFACTS_FILE_SUFFIX=test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836 2025-09-07T08:15:28.7572738Z CI=true 2025-09-07T08:15:28.7573019Z GITHUB_REPOSITORY_OWNER=pytorch 2025-09-07T08:15:28.7573428Z RUST_LOG=sccache::server=error 2025-09-07T08:15:28.7573782Z JOB_ID=49775781836 2025-09-07T08:15:28.7574083Z GITHUB_HEAD_REF= 2025-09-07T08:15:28.7574373Z GITHUB_ACTION_REF= 2025-09-07T08:15:28.7574739Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-09-07T08:15:28.7575401Z TEST_SHOWLOCALS=False 2025-09-07T08:15:28.7575772Z GITHUB_WORKFLOW=inductor-perf-nightly-h100 2025-09-07T08:15:28.7576210Z DEBIAN_FRONTEND=noninteractive 2025-09-07T08:15:28.7576928Z GITHUB_OUTPUT=/home/david/_work/_temp/_runner_file_commands/set_output_bf7ffcd1-0c0b-4451-b4ba-80ac3370ce6f 2025-09-07T08:15:28.7578024Z NO_TD=False 2025-09-07T08:15:28.7578323Z SKIP_SCCACHE_INITIALIZATION=1 2025-09-07T08:15:28.7578710Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-09-07T08:15:28.7579109Z _=/usr/bin/env 2025-09-07T08:15:28.7579403Z + echo 'Testing pytorch' 2025-09-07T08:15:28.7579742Z Testing pytorch 2025-09-07T08:15:28.7580046Z + export LANG=C.UTF-8 2025-09-07T08:15:28.7580366Z + LANG=C.UTF-8 2025-09-07T08:15:28.7580650Z + PR_NUMBER= 2025-09-07T08:15:28.7580994Z + [[ inductor_timm_perf_cuda_h100 == \d\e\f\a\u\l\t ]] 2025-09-07T08:15:28.7581882Z + [[ inductor_timm_perf_cuda_h100 == \d\i\s\t\r\i\b\u\t\e\d ]] 
2025-09-07T08:15:28.7582409Z + [[ inductor_timm_perf_cuda_h100 == \s\l\o\w ]] 2025-09-07T08:15:28.7582969Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *slow-gradcheck* ]] 2025-09-07T08:15:28.7583582Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *cuda* ]] 2025-09-07T08:15:28.7584080Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-09-07T08:15:28.7584550Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-09-07T08:15:28.7585197Z + [[ inductor_timm_perf_cuda_h100 == *crossref* ]] 2025-09-07T08:15:28.7585723Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *rocm* ]] 2025-09-07T08:15:28.7586247Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *xpu* ]] 2025-09-07T08:15:28.7586797Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 != *-bazel-* ]] 2025-09-07T08:15:28.7587289Z + pip_install ninja==1.10.2 2025-09-07T08:15:28.7587758Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-09-07T08:15:28.7588352Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-09-07T08:15:29.5712302Z Collecting ninja==1.10.2 2025-09-07T08:15:29.6102705Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-09-07T08:15:30.0951043Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-09-07T08:15:31.7805920Z Installing collected packages: ninja 2025-09-07T08:15:31.7807117Z Attempting uninstall: ninja 2025-09-07T08:15:31.7815067Z Found existing installation: ninja 1.11.1.3 2025-09-07T08:15:31.7836949Z Uninstalling ninja-1.11.1.3: 2025-09-07T08:15:32.3775892Z Successfully uninstalled ninja-1.11.1.3 2025-09-07T08:15:33.1019197Z Successfully installed ninja-1.10.2 2025-09-07T08:15:33.1710819Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-09-07T08:15:33.1712871Z + 
PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-09-07T08:15:33.1714101Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *aarch64* ]] 2025-09-07T08:15:33.1714629Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *asan* ]] 2025-09-07T08:15:33.1715546Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *-debug* ]] 2025-09-07T08:15:33.1716118Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 != *-bazel-* ]] 2025-09-07T08:15:33.1716854Z + echo 'We are not in debug mode: linux-jammy-cuda12.8-py3.10-gcc9-sm90. Expect the assertion to pass' 2025-09-07T08:15:33.1717769Z We are not in debug mode: linux-jammy-cuda12.8-py3.10-gcc9-sm90. Expect the assertion to pass 2025-09-07T08:15:33.1718400Z + cd test 2025-09-07T08:15:33.1718824Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-09-07T08:15:33.6714620Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:15:33.6719768Z import pynvml # type: ignore[import] 2025-09-07T08:15:34.6512701Z + [[ inductor_timm_perf_cuda_h100 == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-09-07T08:15:34.6513169Z + [[ inductor_timm_perf_cuda_h100 == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-09-07T08:15:34.6514071Z + [[ inductor_timm_perf_cuda_h100 == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-09-07T08:15:34.6515919Z + DYNAMO_BENCHMARK_FLAGS=() 2025-09-07T08:15:34.6516982Z + [[ inductor_timm_perf_cuda_h100 == *pr_time_benchmarks* ]] 2025-09-07T08:15:34.6517371Z + [[ inductor_timm_perf_cuda_h100 == *dynamo_eager* ]] 2025-09-07T08:15:34.6517704Z + [[ inductor_timm_perf_cuda_h100 == *aot_eager* ]] 2025-09-07T08:15:34.6518048Z + [[ inductor_timm_perf_cuda_h100 == *aot_inductor* ]] 2025-09-07T08:15:34.6518717Z + [[ inductor_timm_perf_cuda_h100 == *max_autotune_inductor* ]] 2025-09-07T08:15:34.6519079Z + [[ inductor_timm_perf_cuda_h100 == *inductor* ]] 2025-09-07T08:15:34.6519399Z + [[ inductor_timm_perf_cuda_h100 != *perf* ]] 2025-09-07T08:15:34.6519709Z + [[ inductor_timm_perf_cuda_h100 == *dynamic* ]] 2025-09-07T08:15:34.6520006Z + [[ inductor_timm_perf_cuda_h100 == *cpu* ]] 2025-09-07T08:15:34.6520307Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-09-07T08:15:34.6535818Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *libtorch* ]] 2025-09-07T08:15:34.6536424Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *-bazel-* ]] 2025-09-07T08:15:34.6539433Z + cd test 2025-09-07T08:15:34.6539929Z + python -c 'import torch; print(torch.__config__.show())' 2025-09-07T08:15:35.1830303Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:15:35.1831993Z import pynvml # type: ignore[import] 2025-09-07T08:15:37.5107439Z PyTorch built with: 2025-09-07T08:15:37.5107809Z - GCC 9.5 2025-09-07T08:15:37.5108098Z - C++ Version: 201703 2025-09-07T08:15:37.5108717Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-09-07T08:15:37.5109393Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-09-07T08:15:37.5109802Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-09-07T08:15:37.5110104Z - LAPACK is enabled (usually provided by MKL) 2025-09-07T08:15:37.5110406Z - NNPACK is enabled 2025-09-07T08:15:37.5110631Z - CPU capability usage: AVX2 2025-09-07T08:15:37.5110868Z - CUDA Runtime 12.8 2025-09-07T08:15:37.5111169Z - NVCC architecture flags: -gencode;arch=compute_90,code=sm_90 2025-09-07T08:15:37.5111510Z - CuDNN 90.8 2025-09-07T08:15:37.5116006Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=93fb23d6fae7c4e82c4239a1033e522088742634, CUDA_VERSION=12.8, CUDNN_VERSION=9.8.0, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Werror -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=ON, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, 
USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-09-07T08:15:37.5120057Z 2025-09-07T08:15:38.0813396Z + cd test 2025-09-07T08:15:38.0813740Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-09-07T08:15:38.6139669Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:15:38.6141831Z import pynvml # type: ignore[import] 2025-09-07T08:15:39.3381494Z ATen/Parallel: 2025-09-07T08:15:39.3381793Z at::get_num_threads() : 24 2025-09-07T08:15:39.3382033Z at::get_num_interop_threads() : 96 2025-09-07T08:15:39.3382280Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-09-07T08:15:39.3382504Z omp_get_max_threads() : 24 2025-09-07T08:15:39.3383457Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-09-07T08:15:39.3383936Z mkl_get_max_threads() : 24 2025-09-07T08:15:39.3384238Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-09-07T08:15:39.3384584Z std::thread::hardware_concurrency() : 192 2025-09-07T08:15:39.3384827Z Environment variables: 2025-09-07T08:15:39.3385353Z OMP_NUM_THREADS : [not set] 2025-09-07T08:15:39.3385577Z MKL_NUM_THREADS : [not set] 2025-09-07T08:15:39.3385793Z ATen parallel backend: OpenMP 2025-09-07T08:15:39.3385934Z 2025-09-07T08:15:39.5892493Z + [[ inductor_timm_perf_cuda_h100 == *numpy_2* ]] 2025-09-07T08:15:39.5892873Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *aarch64* ]] 2025-09-07T08:15:39.5893212Z + [[ inductor_timm_perf_cuda_h100 == *backward* ]] 2025-09-07T08:15:39.5893497Z + [[ inductor_timm_perf_cuda_h100 == *xla* ]] 2025-09-07T08:15:39.5893769Z + [[ inductor_timm_perf_cuda_h100 == *vllm* ]] 2025-09-07T08:15:39.5894057Z + [[ 
inductor_timm_perf_cuda_h100 == *executorch* ]] 2025-09-07T08:15:39.5894396Z + [[ inductor_timm_perf_cuda_h100 == \j\i\t\_\l\e\g\a\c\y ]] 2025-09-07T08:15:39.5894749Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *libtorch* ]] 2025-09-07T08:15:39.5895465Z + [[ inductor_timm_perf_cuda_h100 == distributed ]] 2025-09-07T08:15:39.5895782Z + [[ inductor_timm_perf_cuda_h100 == *operator_benchmark* ]] 2025-09-07T08:15:39.5896131Z + [[ inductor_timm_perf_cuda_h100 == *inductor_distributed* ]] 2025-09-07T08:15:39.5896480Z + [[ inductor_timm_perf_cuda_h100 == *inductor-halide* ]] 2025-09-07T08:15:39.5896836Z + [[ inductor_timm_perf_cuda_h100 == *inductor-triton-cpu* ]] 2025-09-07T08:15:39.5897199Z + [[ inductor_timm_perf_cuda_h100 == *inductor-micro-benchmark* ]] 2025-09-07T08:15:39.5897550Z + [[ inductor_timm_perf_cuda_h100 == *huggingface* ]] 2025-09-07T08:15:39.5897836Z + [[ inductor_timm_perf_cuda_h100 == *timm* ]] 2025-09-07T08:15:39.5898090Z + install_torchvision 2025-09-07T08:15:39.5898279Z + local orig_preload 2025-09-07T08:15:39.5898466Z + local commit 2025-09-07T08:15:39.5898655Z ++ get_pinned_commit vision 2025-09-07T08:15:39.5898901Z ++ cat .github/ci_commit_pins/vision.txt 2025-09-07T08:15:39.5914571Z + commit=966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-09-07T08:15:39.5914888Z + orig_preload= 2025-09-07T08:15:39.5915273Z + '[' -n '' ']' 2025-09-07T08:15:39.5915532Z + [[ linux-jammy-cuda12.8-py3.10-gcc9-sm90 == *cuda* ]] 2025-09-07T08:15:39.5915842Z + export FORCE_CUDA=1 2025-09-07T08:15:39.5916064Z + FORCE_CUDA=1 2025-09-07T08:15:39.5916253Z + export WITH_CUDA=1 2025-09-07T08:15:39.5916482Z + WITH_CUDA=1 2025-09-07T08:15:39.5917091Z + pip_build_and_install git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 dist/vision 2025-09-07T08:15:39.5917977Z + local build_target=git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-09-07T08:15:39.5918478Z + local wheel_dir=dist/vision 
2025-09-07T08:15:39.5918723Z + local found_whl=0 2025-09-07T08:15:39.5918949Z + for file in "${wheel_dir}"/*.whl 2025-09-07T08:15:39.5919365Z + [[ -f dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl ]] 2025-09-07T08:15:39.5919777Z + found_whl=1 2025-09-07T08:15:39.5919967Z + break 2025-09-07T08:15:39.5920155Z + '[' 1 == 0 ']' 2025-09-07T08:15:39.5920374Z + for file in "${wheel_dir}"/*.whl 2025-09-07T08:15:39.5920803Z + pip_install_whl dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T08:15:39.5921757Z + args=('dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl') 2025-09-07T08:15:39.5922161Z + local args 2025-09-07T08:15:39.5922517Z + [[ dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-09-07T08:15:39.5922936Z + for path in "${args[@]}" 2025-09-07T08:15:39.5923354Z + echo 'Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl' 2025-09-07T08:15:39.5923950Z Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T08:15:39.5924837Z + python3 -mpip install --no-index --no-deps dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T08:15:39.9330290Z Processing ./dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-09-07T08:15:39.9417308Z Installing collected packages: torchvision 2025-09-07T08:15:40.3735188Z Successfully installed torchvision-0.22.0a0+966da7e 2025-09-07T08:15:40.4091286Z + '[' -n '' ']' 2025-09-07T08:15:40.4091552Z + id=6 2025-09-07T08:15:40.4091783Z + test_dynamo_benchmark timm_models 6 2025-09-07T08:15:40.4097704Z ++ pwd 2025-09-07T08:15:40.4100857Z + TEST_REPORTS_DIR=/var/lib/jenkins/workspace/test/test-reports 2025-09-07T08:15:40.4101288Z + local suite=timm_models 2025-09-07T08:15:40.4101619Z + shift 2025-09-07T08:15:40.4101812Z + local shard_id=6 2025-09-07T08:15:40.4102019Z + shift 2025-09-07T08:15:40.4102262Z + [[ 
inductor_timm_perf_cuda_h100 == *perf_compare* ]] 2025-09-07T08:15:40.4102612Z + [[ inductor_timm_perf_cuda_h100 == *perf* ]] 2025-09-07T08:15:40.4102936Z + [[ inductor_timm_perf_cuda_h100 == *b200* ]] 2025-09-07T08:15:40.4103290Z + test_single_dynamo_benchmark dashboard timm_models 6 2025-09-07T08:15:40.4106427Z ++ pwd 2025-09-07T08:15:40.4108398Z + TEST_REPORTS_DIR=/var/lib/jenkins/workspace/test/test-reports 2025-09-07T08:15:40.4108821Z + mkdir -p /var/lib/jenkins/workspace/test/test-reports 2025-09-07T08:15:40.4129077Z + local name=dashboard 2025-09-07T08:15:40.4129301Z + shift 2025-09-07T08:15:40.4129512Z + local suite=timm_models 2025-09-07T08:15:40.4129738Z + shift 2025-09-07T08:15:40.4129922Z + local shard_id=6 2025-09-07T08:15:40.4130132Z + shift 2025-09-07T08:15:40.4130310Z + partition_flags=() 2025-09-07T08:15:40.4130539Z + local partition_flags 2025-09-07T08:15:40.4130762Z + [[ -n 7 ]] 2025-09-07T08:15:40.4130951Z + [[ -n 6 ]] 2025-09-07T08:15:40.4131306Z + partition_flags=(--total-partitions "$NUM_TEST_SHARDS" --partition-id "$shard_id") 2025-09-07T08:15:40.4131794Z + [[ inductor_timm_perf_cuda_h100 == *perf_compare* ]] 2025-09-07T08:15:40.4132131Z + [[ inductor_timm_perf_cuda_h100 == *perf* ]] 2025-09-07T08:15:40.4132597Z + test_perf_for_dashboard timm_models --device cuda --total-partitions 7 --partition-id 6 2025-09-07T08:15:40.4134089Z ++ pwd 2025-09-07T08:15:40.4136829Z + TEST_REPORTS_DIR=/var/lib/jenkins/workspace/test/test-reports 2025-09-07T08:15:40.4137185Z + mkdir -p /var/lib/jenkins/workspace/test/test-reports 2025-09-07T08:15:40.4152113Z + local suite=timm_models 2025-09-07T08:15:40.4152357Z + shift 2025-09-07T08:15:40.4152533Z + local backend=inductor 2025-09-07T08:15:40.4152751Z + modes=() 2025-09-07T08:15:40.4152928Z + local modes 2025-09-07T08:15:40.4154016Z + [[ 
training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *training-true* ]] 2025-09-07T08:15:40.4155308Z + modes+=(training) 2025-09-07T08:15:40.4156411Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *inference-true* ]] 2025-09-07T08:15:40.4157549Z + modes+=(inference) 2025-09-07T08:15:40.4157767Z + targets=('accuracy' 'performance') 2025-09-07T08:15:40.4158005Z + local targets 2025-09-07T08:15:40.4158188Z + local device=cuda 2025-09-07T08:15:40.4158392Z + [[ inductor_timm_perf_cuda_h100 == *cpu* ]] 2025-09-07T08:15:40.4159254Z + [[ inductor_timm_perf_cuda_h100 == *cuda_a10g* ]] 2025-09-07T08:15:40.4159539Z + [[ inductor_timm_perf_cuda_h100 == *h100* ]] 2025-09-07T08:15:40.4159779Z + device=cuda_h100 2025-09-07T08:15:40.4159974Z + for mode in "${modes[@]}" 2025-09-07T08:15:40.4160188Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T08:15:40.4160430Z + [[ training == \t\r\a\i\n\i\n\g ]] 2025-09-07T08:15:40.4160647Z + dtype=amp 2025-09-07T08:15:40.4160822Z + for target in "${targets[@]}" 2025-09-07T08:15:40.4161044Z + target_flag=('--accuracy') 2025-09-07T08:15:40.4161252Z + local target_flag 2025-09-07T08:15:40.4161684Z + [[ accuracy == \p\e\r\f\o\r\m\a\n\c\e ]] 2025-09-07T08:15:40.4161948Z + [[ accuracy == \a\c\c\u\r\a\c\y ]] 2025-09-07T08:15:40.4162210Z + target_flag+=(--no-translation-validation) 2025-09-07T08:15:40.4163284Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freezing-true* ]] 2025-09-07T08:15:40.4165323Z + [[ 
training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *default-true* ]] 2025-09-07T08:15:40.4167309Z + python benchmarks/dynamo/timm_models.py --accuracy --no-translation-validation --training --amp --backend inductor --disable-cudagraphs --device cuda --total-partitions 7 --partition-id 6 --output /var/lib/jenkins/workspace/test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.csv 2025-09-07T08:15:41.4291035Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:15:41.4292258Z import pynvml # type: ignore[import] 2025-09-07T08:15:46.7407962Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:15:46.7409658Z import pynvml # type: ignore[import] 2025-09-07T08:15:49.7555268Z 2025-09-07T08:15:50.2340561Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:15:50.2340825Z 2025-09-07T08:15:50.3520608Z model.safetensors: 0% 0.00/130M [00:00 will be ignored 2025-09-07T08:16:35.2349150Z pass 2025-09-07T08:16:39.8596070Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:16:39.8597303Z import pynvml # type: ignore[import] 2025-09-07T08:16:42.8918241Z 2025-09-07T08:16:43.1466328Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:16:43.1467531Z 2025-09-07T08:16:43.2464343Z model.safetensors: 0% 0.00/17.9M [00:00 will be ignored 2025-09-07T08:17:49.5156720Z pass 2025-09-07T08:17:55.0169247Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:17:55.0171195Z import pynvml # type: ignore[import] 2025-09-07T08:17:58.0240461Z 2025-09-07T08:17:59.8378968Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:17:59.8379312Z 2025-09-07T08:17:59.9391825Z model.safetensors: 0% 0.00/353M [00:00 will be ignored 2025-09-07T08:20:03.8680575Z pass 2025-09-07T08:20:10.9384797Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:20:10.9386370Z import pynvml # type: ignore[import] 2025-09-07T08:20:13.9643269Z 2025-09-07T08:20:15.9990538Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:20:15.9990766Z 2025-09-07T08:20:16.1434049Z model.safetensors: 0% 0.00/777M [00:00 will be ignored 2025-09-07T08:24:01.4677246Z pass 2025-09-07T08:24:08.3777419Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:24:08.3779490Z import pynvml # type: ignore[import] 2025-09-07T08:24:11.3632319Z 2025-09-07T08:24:11.7587354Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:24:11.7587667Z 2025-09-07T08:24:12.0566699Z model.safetensors: 0% 0.00/25.0M [00:00 will be ignored 2025-09-07T08:26:58.0782252Z pass 2025-09-07T08:27:05.3993170Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:27:05.3994411Z import pynvml # type: ignore[import] 2025-09-07T08:27:08.4042383Z 2025-09-07T08:27:09.8206622Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:27:09.8206922Z 2025-09-07T08:27:09.9643098Z model.safetensors: 0% 0.00/175M [00:00 will be ignored 2025-09-07T08:29:39.8813427Z pass 2025-09-07T08:29:48.5560930Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:29:48.5562686Z import pynvml # type: ignore[import] 2025-09-07T08:29:51.5608742Z 2025-09-07T08:29:52.5059554Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:29:52.5059832Z 2025-09-07T08:29:52.6155888Z model.safetensors: 0% 0.00/161M [00:00 will be ignored 2025-09-07T08:30:39.2629442Z pass 2025-09-07T08:30:44.1382717Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:30:44.1384131Z import pynvml # type: ignore[import] 2025-09-07T08:30:47.4126557Z 2025-09-07T08:30:48.4433889Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:30:48.4434207Z 2025-09-07T08:30:48.5478464Z model.safetensors: 0% 0.00/346M [00:00 will be ignored 2025-09-07T08:31:31.7052261Z pass 2025-09-07T08:31:36.3259483Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:31:36.3261566Z import pynvml # type: ignore[import] 2025-09-07T08:31:39.4359138Z 2025-09-07T08:31:41.2393900Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:31:41.2394147Z 2025-09-07T08:31:41.3797787Z model.safetensors: 0% 0.00/107M [00:00 will be ignored 2025-09-07T08:33:07.3501586Z pass 2025-09-07T08:33:13.3131503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:33:13.3132773Z import pynvml # type: ignore[import] 2025-09-07T08:33:16.3377383Z 2025-09-07T08:33:18.8373649Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:33:18.8373974Z 2025-09-07T08:33:18.9476024Z model.safetensors: 0% 0.00/756M [00:00 will be ignored 2025-09-07T08:36:47.9331011Z pass_due_to_skip 2025-09-07T08:36:55.9659072Z accuracy pass_rate=92.31% 2025-09-07T08:36:55.9665805Z calls_captured gmean=1265.66x mean=1732.077x 2025-09-07T08:36:55.9668910Z unique_graphs gmean=2.73x mean=2.769x 2025-09-07T08:36:55.9672123Z graph_breaks gmean=6.76x mean=6.769x 2025-09-07T08:36:55.9676007Z unique_graph_breaks gmean=5.00x mean=5.000x 2025-09-07T08:36:55.9679311Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T08:36:55.9682661Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T08:36:55.9686247Z cudagraph_skips gmean=0.00x mean=0.000x 2025-09-07T08:36:55.9687438Z compilation_latency mean=83.777 seconds 2025-09-07T08:36:56.9951547Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cudagraphs-true* ]] 2025-09-07T08:36:56.9953863Z + python benchmarks/dynamo/timm_models.py --accuracy --no-translation-validation --training --amp --backend inductor --device cuda --total-partitions 7 --partition-id 6 --output /var/lib/jenkins/workspace/test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.csv 2025-09-07T08:36:58.0269367Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:36:58.0270558Z import pynvml # type: ignore[import] 2025-09-07T08:37:03.1412323Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:37:03.1413560Z import pynvml # type: ignore[import] 2025-09-07T08:37:06.1953462Z 2025-09-07T08:37:07.6652057Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:37:07.6652418Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:37:07.6652715Z cuda train selecsls42b 2025-09-07T08:37:32.1810553Z W0907 08:37:32.180000 34574 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:37:49.3170618Z pass 2025-09-07T08:37:54.3125710Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:37:54.3126983Z import pynvml # type: ignore[import] 2025-09-07T08:37:57.4636578Z 2025-09-07T08:37:58.7988605Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:37:58.7988942Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:37:58.7989853Z cuda train spnasnet_100 2025-09-07T08:38:36.2064228Z W0907 08:38:36.205000 34833 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:39:03.5466127Z pass 2025-09-07T08:39:09.2516069Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:39:09.2517280Z import pynvml # type: ignore[import] 2025-09-07T08:39:12.2888671Z 2025-09-07T08:39:14.3938442Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:39:14.3938984Z loading model: 0it [00:02, ?it/s] 2025-09-07T08:39:14.3939484Z cuda train swin_base_patch4_window7_224 2025-09-07T08:40:23.0370796Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T08:40:23.0371871Z pred = mod(*cloned_inputs) 2025-09-07T08:40:23.0372425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 838, in forward 2025-09-07T08:40:23.0372973Z x = self.forward_features(x) 2025-09-07T08:40:23.0373558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 830, in forward_features 2025-09-07T08:40:23.0374116Z x = self.layers(x) 2025-09-07T08:40:23.0374605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 559, in forward 2025-09-07T08:40:23.0375775Z x = self.blocks(x) 2025-09-07T08:40:23.0376246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 406, in forward 2025-09-07T08:40:23.0376829Z x = x + self.drop_path1(self._attn(self.norm1(x))) 2025-09-07T08:40:23.0377420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 390, in _attn 2025-09-07T08:40:23.0377991Z attn_windows = self.attn(x_windows, mask=attn_mask) # nW*B, window_size*window_size, C 2025-09-07T08:40:23.0378580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 194, in forward 2025-09-07T08:40:23.0379049Z attn = attn + self._get_rel_pos_bias() 2025-09-07T08:40:23.0379543Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 165, in _get_rel_pos_bias 2025-09-07T08:40:23.0380095Z relative_position_bias = self.relative_position_bias_table[ 2025-09-07T08:40:23.0380318Z 2025-09-07T08:40:23.0380322Z 2025-09-07T08:40:23.4419831Z W0907 08:40:23.441000 35092 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:41:12.6287542Z pass 2025-09-07T08:41:19.6009693Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:41:19.6010925Z import pynvml # type: ignore[import] 2025-09-07T08:41:22.6028688Z 2025-09-07T08:41:25.7049729Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:41:25.7050056Z loading model: 0it [00:03, ?it/s] 2025-09-07T08:41:25.7050350Z cuda train swsl_resnext101_32x16d 2025-09-07T08:42:16.9013230Z pass 2025-09-07T08:42:22.3050646Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:42:22.3052559Z import pynvml # type: ignore[import] 2025-09-07T08:42:25.3053021Z 2025-09-07T08:42:26.9338839Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:42:26.9339390Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:42:26.9339882Z cuda train tf_efficientnet_b0 2025-09-07T08:43:11.0449876Z pass 2025-09-07T08:43:15.8469976Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:43:15.8471230Z import pynvml # type: ignore[import] 2025-09-07T08:43:18.8767202Z 2025-09-07T08:43:20.7996323Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:43:20.7996671Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:43:20.7996988Z cuda train tf_mixnet_l 2025-09-07T08:44:21.5630884Z W0907 08:44:21.562000 35870 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:45:10.0715544Z pass 2025-09-07T08:45:17.6558852Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:45:17.6560096Z import pynvml # type: ignore[import] 2025-09-07T08:45:20.6876257Z 2025-09-07T08:45:22.0336127Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:45:22.0336685Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:45:22.0337168Z cuda train tinynet_a 2025-09-07T08:46:08.7041400Z pass 2025-09-07T08:46:13.6389431Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:46:13.6390656Z import pynvml # type: ignore[import] 2025-09-07T08:46:16.6350693Z 2025-09-07T08:46:20.0169680Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:46:20.0170003Z loading model: 0it [00:03, ?it/s] 2025-09-07T08:46:20.0170256Z cuda train tnt_s_patch16_224 2025-09-07T08:47:13.7689657Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T08:47:13.7690760Z pred = mod(*cloned_inputs) 2025-09-07T08:47:13.7691260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 335, in forward 2025-09-07T08:47:13.7691750Z x = self.forward_features(x) 2025-09-07T08:47:13.7692303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 311, in forward_features 2025-09-07T08:47:13.7692902Z pixel_embed = self.pixel_embed(x, self.pixel_pos) 2025-09-07T08:47:13.7693473Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 183, in forward 2025-09-07T08:47:13.7693905Z x = self.unfold(x) 2025-09-07T08:47:13.7694027Z 2025-09-07T08:47:13.7694031Z 2025-09-07T08:47:14.1392632Z W0907 08:47:14.138000 36388 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:48:08.2833890Z pass 2025-09-07T08:48:14.8348161Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:48:14.8350101Z import pynvml # type: ignore[import] 2025-09-07T08:48:17.8991789Z 2025-09-07T08:48:24.1037719Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:48:24.1038090Z loading model: 0it [00:06, ?it/s] 2025-09-07T08:48:24.1038372Z cuda train twins_pcpvt_base 2025-09-07T08:49:27.0696057Z W0907 08:49:27.068000 36648 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:50:56.8970315Z pass 2025-09-07T08:51:04.7113279Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:51:04.7114483Z import pynvml # type: ignore[import] 2025-09-07T08:51:07.7767816Z 2025-09-07T08:51:09.3058674Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:51:09.3059169Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:51:09.3059632Z cuda train visformer_small 2025-09-07T08:51:33.5299995Z W0907 08:51:33.529000 36907 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:51:49.9039017Z pass 2025-09-07T08:51:54.6756007Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:51:54.6757322Z import pynvml # type: ignore[import] 2025-09-07T08:51:57.7184862Z 2025-09-07T08:51:59.3015297Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:51:59.3015669Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:51:59.3015985Z cuda train vit_base_patch16_224 2025-09-07T08:52:19.0250581Z W0907 08:52:19.023000 37167 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:52:40.0687722Z pass 2025-09-07T08:52:44.5684631Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:52:44.5686291Z import pynvml # type: ignore[import] 2025-09-07T08:52:47.6034330Z 2025-09-07T08:52:49.2290901Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:52:49.2291277Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:52:49.2291576Z cuda train volo_d1_224 2025-09-07T08:53:14.0318154Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T08:53:14.0319262Z pred = mod(*cloned_inputs) 2025-09-07T08:53:14.0319817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 822, in forward 2025-09-07T08:53:14.0320319Z x = self.forward_features(x) 2025-09-07T08:53:14.0320840Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 795, in forward_features 2025-09-07T08:53:14.0321370Z x = self.forward_tokens(x) 2025-09-07T08:53:14.0322398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 642, in forward_tokens 2025-09-07T08:53:14.0322898Z x = block(x) 2025-09-07T08:53:14.0323319Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 135, in forward 2025-09-07T08:53:14.0323837Z x = x + self.drop_path1(self.attn(self.norm1(x))) 2025-09-07T08:53:14.0324346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 90, in forward 2025-09-07T08:53:14.0325841Z x = F.fold(x, output_size=(H, W), kernel_size=self.kernel_size, padding=self.padding, stride=self.stride) 2025-09-07T08:53:14.0326197Z 2025-09-07T08:53:14.0326201Z 2025-09-07T08:53:34.4701740Z W0907 08:53:34.469000 37426 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:54:11.3019702Z pass 2025-09-07T08:54:16.8422299Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:54:16.8423999Z import pynvml # type: ignore[import] 2025-09-07T08:54:19.9162804Z 2025-09-07T08:54:23.0840564Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:54:23.0840893Z loading model: 0it [00:03, ?it/s] 2025-09-07T08:54:23.0841176Z cuda train xcit_large_24_p8_224 2025-09-07T08:55:58.9268354Z W0907 08:55:58.925000 37686 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:57:53.0920080Z pass_due_to_skip 2025-09-07T08:58:02.1900412Z accuracy pass_rate=92.31% 2025-09-07T08:58:02.1912702Z calls_captured gmean=1265.66x mean=1732.077x 2025-09-07T08:58:02.1913061Z unique_graphs gmean=2.73x mean=2.769x 2025-09-07T08:58:02.1914238Z graph_breaks gmean=6.76x mean=6.769x 2025-09-07T08:58:02.1918056Z unique_graph_breaks gmean=5.00x mean=5.000x 2025-09-07T08:58:02.1921309Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T08:58:02.1924597Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T08:58:02.1928535Z cudagraph_skips 
gmean=0.00x mean=0.231x 2025-09-07T08:58:02.1929753Z compilation_latency mean=82.531 seconds 2025-09-07T08:58:03.2259652Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *dynamic-true* ]] 2025-09-07T08:58:03.2262619Z + python benchmarks/dynamo/timm_models.py --accuracy --no-translation-validation --training --amp --backend inductor --dynamic-shapes --dynamic-batch-only --device cuda --total-partitions 7 --partition-id 6 --output /var/lib/jenkins/workspace/test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_accuracy.csv 2025-09-07T08:58:04.2745710Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:58:04.2747012Z import pynvml # type: ignore[import] 2025-09-07T08:58:09.2255573Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:58:09.2256804Z import pynvml # type: ignore[import] 2025-09-07T08:58:12.2759970Z 2025-09-07T08:58:13.8511417Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:58:13.8511748Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:58:13.8512053Z cuda train selecsls42b 2025-09-07T08:58:23.6070010Z W0907 08:58:23.606000 37996 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:58:27.5268539Z pass 2025-09-07T08:58:30.9659501Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:58:30.9661548Z import pynvml # type: ignore[import] 2025-09-07T08:58:33.9711824Z 2025-09-07T08:58:35.3094177Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:58:35.3094578Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:58:35.3094904Z cuda train spnasnet_100 2025-09-07T08:58:44.8146764Z W0907 08:58:44.813000 38333 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:58:51.3353904Z pass 2025-09-07T08:58:54.9957234Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T08:58:54.9958833Z import pynvml # type: ignore[import] 2025-09-07T08:58:57.9828348Z 2025-09-07T08:59:00.9613757Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:59:00.9614176Z loading model: 0it [00:02, ?it/s] 2025-09-07T08:59:00.9614491Z cuda train swin_base_patch4_window7_224 2025-09-07T08:59:12.1247679Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T08:59:12.1248670Z pred = mod(*cloned_inputs) 2025-09-07T08:59:12.1249161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 838, in forward 2025-09-07T08:59:12.1249655Z x = self.forward_features(x) 2025-09-07T08:59:12.1250159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 830, in forward_features 2025-09-07T08:59:12.1250650Z x = self.layers(x) 2025-09-07T08:59:12.1251062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 559, in forward 2025-09-07T08:59:12.1251501Z x = self.blocks(x) 2025-09-07T08:59:12.1251915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 406, in forward 2025-09-07T08:59:12.1252415Z x = x + self.drop_path1(self._attn(self.norm1(x))) 2025-09-07T08:59:12.1252917Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 390, in _attn 2025-09-07T08:59:12.1253504Z attn_windows = self.attn(x_windows, mask=attn_mask) # nW*B, window_size*window_size, C 2025-09-07T08:59:12.1254093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 194, in forward 2025-09-07T08:59:12.1254567Z attn = attn + self._get_rel_pos_bias() 2025-09-07T08:59:12.1255441Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 165, in _get_rel_pos_bias 2025-09-07T08:59:12.1256019Z relative_position_bias = self.relative_position_bias_table[ 2025-09-07T08:59:12.1256243Z 2025-09-07T08:59:12.1256247Z 2025-09-07T08:59:13.6526050Z W0907 08:59:13.651000 38603 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T08:59:22.9877740Z pass 2025-09-07T08:59:26.7091777Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:59:26.7093984Z import pynvml # type: ignore[import] 2025-09-07T08:59:29.8067519Z 2025-09-07T08:59:33.5665627Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:59:33.5665996Z loading model: 0it [00:03, ?it/s] 2025-09-07T08:59:33.5667447Z cuda train swsl_resnext101_32x16d 2025-09-07T08:59:46.1552781Z pass 2025-09-07T08:59:49.5337831Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T08:59:49.5339100Z import pynvml # type: ignore[import] 2025-09-07T08:59:52.5858497Z 2025-09-07T08:59:54.0383580Z loading model: 0it [00:00, ?it/s] 2025-09-07T08:59:54.0383965Z loading model: 0it [00:01, ?it/s] 2025-09-07T08:59:54.0384804Z cuda train tf_efficientnet_b0 2025-09-07T09:00:04.5834682Z pass 2025-09-07T09:00:08.0453381Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:00:08.0454487Z import pynvml # type: ignore[import] 2025-09-07T09:00:11.0806646Z 2025-09-07T09:00:13.1218333Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:00:13.1218693Z loading model: 0it [00:02, ?it/s] 2025-09-07T09:00:13.1219037Z cuda train tf_mixnet_l 2025-09-07T09:00:25.9819890Z W0907 09:00:25.981000 39413 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:00:36.6266545Z pass 2025-09-07T09:00:40.5320109Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:00:40.5321705Z import pynvml # type: ignore[import] 2025-09-07T09:00:43.5328384Z 2025-09-07T09:00:45.3207990Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:00:45.3208368Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:00:45.3208677Z cuda train tinynet_a 2025-09-07T09:00:56.6272038Z pass 2025-09-07T09:01:00.1706518Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:01:00.1708340Z import pynvml # type: ignore[import] 2025-09-07T09:01:03.2939477Z 2025-09-07T09:01:05.3583158Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:01:05.3583593Z loading model: 0it [00:02, ?it/s] 2025-09-07T09:01:05.3585764Z cuda train tnt_s_patch16_224 2025-09-07T09:01:15.1307288Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T09:01:15.1308390Z pred = mod(*cloned_inputs) 2025-09-07T09:01:15.1308925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 335, in forward 2025-09-07T09:01:15.1309462Z x = self.forward_features(x) 2025-09-07T09:01:15.1310000Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 311, in forward_features 2025-09-07T09:01:15.1310566Z pixel_embed = self.pixel_embed(x, self.pixel_pos) 2025-09-07T09:01:15.1311104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 183, in forward 2025-09-07T09:01:15.1312063Z x = self.unfold(x) 2025-09-07T09:01:15.1312197Z 2025-09-07T09:01:15.1312200Z 2025-09-07T09:01:16.9608762Z W0907 09:01:16.960000 39953 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:01:27.3533551Z pass 2025-09-07T09:01:31.1453543Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:01:31.1455688Z import pynvml # type: ignore[import] 2025-09-07T09:01:34.3165955Z 2025-09-07T09:01:39.7582487Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:01:39.7583028Z loading model: 0it [00:05, ?it/s] 2025-09-07T09:01:39.7583908Z cuda train twins_pcpvt_base 2025-09-07T09:01:52.3615633Z W0907 09:01:52.360000 40223 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:02:11.8495645Z pass 2025-09-07T09:02:15.7202168Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:02:15.7204031Z import pynvml # type: ignore[import] 2025-09-07T09:02:18.7016711Z 2025-09-07T09:02:21.0146913Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:02:21.0147271Z loading model: 0it [00:02, ?it/s] 2025-09-07T09:02:21.0147562Z cuda train visformer_small 2025-09-07T09:02:28.9871333Z W0907 09:02:28.986000 40493 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:02:32.7638905Z pass 2025-09-07T09:02:36.2079454Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:02:36.2080623Z import pynvml # type: ignore[import] 2025-09-07T09:02:39.2041341Z 2025-09-07T09:02:40.7669293Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:02:40.7669619Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:02:40.7669919Z cuda train vit_base_patch16_224 2025-09-07T09:02:47.7874848Z W0907 09:02:47.786000 40763 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:02:52.4461083Z pass 2025-09-07T09:02:55.9382903Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:02:55.9384422Z import pynvml # type: ignore[import] 2025-09-07T09:02:59.1400307Z 2025-09-07T09:03:00.6412909Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:03:00.6413285Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:03:00.6413608Z cuda train volo_d1_224 2025-09-07T09:03:08.8069129Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T09:03:08.8070321Z pred = mod(*cloned_inputs) 2025-09-07T09:03:08.8070841Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 822, in forward 2025-09-07T09:03:08.8071348Z x = self.forward_features(x) 2025-09-07T09:03:08.8071865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 795, in forward_features 2025-09-07T09:03:08.8072916Z x = self.forward_tokens(x) 2025-09-07T09:03:08.8073466Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 642, in forward_tokens 2025-09-07T09:03:08.8074066Z x = block(x) 2025-09-07T09:03:08.8074468Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 135, in forward 2025-09-07T09:03:08.8075436Z x = x + self.drop_path1(self.attn(self.norm1(x))) 2025-09-07T09:03:08.8076210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 90, in forward 2025-09-07T09:03:08.8076897Z x = F.fold(x, output_size=(H, W), kernel_size=self.kernel_size, padding=self.padding, stride=self.stride) 2025-09-07T09:03:08.8077265Z 2025-09-07T09:03:08.8077268Z 2025-09-07T09:03:09.6916633Z W0907 09:03:09.690000 41033 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:03:16.8634244Z pass 2025-09-07T09:03:20.3538859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:03:20.3540097Z import pynvml # type: ignore[import] 2025-09-07T09:03:23.3740354Z 2025-09-07T09:03:26.7772926Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:03:26.7773276Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:03:26.7773616Z cuda train xcit_large_24_p8_224 2025-09-07T09:03:42.5294719Z W0907 09:03:42.528000 41303 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:04:07.4897244Z pass_due_to_skip 2025-09-07T09:04:10.5046119Z accuracy pass_rate=92.31% 2025-09-07T09:04:10.5050253Z calls_captured gmean=1265.66x mean=1732.077x 2025-09-07T09:04:10.5053156Z unique_graphs gmean=2.73x mean=2.769x 2025-09-07T09:04:10.5056894Z graph_breaks gmean=6.76x mean=6.769x 2025-09-07T09:04:10.5060306Z unique_graph_breaks gmean=5.00x mean=5.000x 2025-09-07T09:04:10.5063849Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T09:04:10.5067407Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T09:04:10.5070658Z cudagraph_skips 
gmean=0.00x mean=0.231x 2025-09-07T09:04:10.5071790Z compilation_latency mean=15.778 seconds 2025-09-07T09:04:11.5508907Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cppwrapper-true* ]] 2025-09-07T09:04:11.5510647Z + TORCHINDUCTOR_CPP_WRAPPER=1 2025-09-07T09:04:11.5512401Z + python benchmarks/dynamo/timm_models.py --accuracy --no-translation-validation --training --amp --backend inductor --disable-cudagraphs --device cuda --total-partitions 7 --partition-id 6 --output /var/lib/jenkins/workspace/test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_accuracy.csv 2025-09-07T09:04:12.6356441Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:04:12.6357736Z import pynvml # type: ignore[import] 2025-09-07T09:04:17.5673085Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:04:17.5674513Z import pynvml # type: ignore[import] 2025-09-07T09:04:20.5956902Z 2025-09-07T09:04:22.2190369Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:04:22.2190905Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:04:22.2191373Z cuda train selecsls42b 2025-09-07T09:05:13.0580961Z W0907 09:05:13.057000 41660 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:05:38.8798854Z pass 2025-09-07T09:05:44.0268178Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:05:44.0269498Z import pynvml # type: ignore[import] 2025-09-07T09:05:47.0688496Z 2025-09-07T09:05:48.4230891Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:05:48.4231220Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:05:48.4231505Z cuda train spnasnet_100 2025-09-07T09:07:07.3612956Z W0907 09:07:07.360000 43078 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:07:47.9829168Z pass 2025-09-07T09:07:54.3836891Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:07:54.3838111Z import pynvml # type: ignore[import] 2025-09-07T09:07:57.3867292Z 2025-09-07T09:08:01.4260254Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:08:01.4260638Z loading model: 0it [00:04, ?it/s] 2025-09-07T09:08:01.4260947Z cuda train swin_base_patch4_window7_224 2025-09-07T09:10:29.5374515Z W0907 09:10:29.536000 43859 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:11:46.5744724Z pass 2025-09-07T09:11:54.9450674Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:11:54.9452062Z import pynvml # type: ignore[import] 2025-09-07T09:11:58.1116875Z 2025-09-07T09:12:01.1079231Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:12:01.1079585Z loading model: 0it [00:02, ?it/s] 2025-09-07T09:12:01.1079923Z cuda train swsl_resnext101_32x16d 2025-09-07T09:13:41.5219042Z pass 2025-09-07T09:13:47.2827508Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:13:47.2828760Z import pynvml # type: ignore[import] 2025-09-07T09:13:50.2917992Z 2025-09-07T09:13:51.5496453Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:13:51.5496822Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:13:51.5497134Z cuda train tf_efficientnet_b0 2025-09-07T09:15:18.7403492Z pass 2025-09-07T09:15:24.2496353Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:15:24.2498329Z import pynvml # type: ignore[import] 2025-09-07T09:15:27.2753587Z 2025-09-07T09:15:28.6477231Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:15:28.6477570Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:15:28.6477854Z cuda train tf_mixnet_l 2025-09-07T09:17:33.6113520Z W0907 09:17:33.610000 46459 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:18:48.6581008Z pass 2025-09-07T09:18:57.3478761Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:18:57.3480006Z import pynvml # type: ignore[import] 2025-09-07T09:19:00.5389680Z 2025-09-07T09:19:01.9404273Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:19:01.9404714Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:19:01.9405511Z cuda train tinynet_a 2025-09-07T09:20:47.7550113Z pass 2025-09-07T09:20:53.3508804Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:20:53.3510001Z import pynvml # type: ignore[import] 2025-09-07T09:20:56.5133631Z 2025-09-07T09:20:57.8487670Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:20:57.8488032Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:20:57.8488321Z cuda train tnt_s_patch16_224 2025-09-07T09:23:03.9212621Z W0907 09:23:03.920000 47637 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:24:31.0132252Z pass 2025-09-07T09:24:40.0100733Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:24:40.0102043Z import pynvml # type: ignore[import] 2025-09-07T09:24:43.1096241Z 2025-09-07T09:24:44.6306818Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:24:44.6307258Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:24:44.6307650Z cuda train twins_pcpvt_base 2025-09-07T09:27:32.3315854Z W0907 09:27:32.330000 48451 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:29:44.2434508Z pass 2025-09-07T09:29:53.9513048Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:29:53.9514273Z import pynvml # type: ignore[import] 2025-09-07T09:29:56.9720294Z 2025-09-07T09:29:58.3271676Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:29:58.3271982Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:29:58.3272264Z cuda train visformer_small 2025-09-07T09:30:43.7564306Z W0907 09:30:43.755000 48976 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:31:09.1023296Z pass 2025-09-07T09:31:14.2060619Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:31:14.2062142Z import pynvml # type: ignore[import] 2025-09-07T09:31:17.1981644Z 2025-09-07T09:31:18.7127349Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:31:18.7127896Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:31:18.7128379Z cuda train vit_base_patch16_224 2025-09-07T09:31:56.3557832Z W0907 09:31:56.354000 49598 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:32:25.7463370Z pass 2025-09-07T09:32:30.7434828Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:32:30.7436255Z import pynvml # type: ignore[import] 2025-09-07T09:32:33.8412213Z 2025-09-07T09:32:35.1627633Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:32:35.1628645Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:32:35.1628992Z cuda train volo_d1_224 2025-09-07T09:34:13.6123511Z W0907 09:34:13.611000 50123 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:35:09.4445915Z pass 2025-09-07T09:35:16.4229677Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:35:16.4231174Z import pynvml # type: ignore[import] 2025-09-07T09:35:19.5100664Z 2025-09-07T09:35:22.6782196Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:35:22.6782556Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:35:22.6782880Z cuda train xcit_large_24_p8_224 2025-09-07T09:39:40.8923429Z W0907 09:39:40.891000 50777 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:42:28.3893181Z pass_due_to_skip 2025-09-07T09:42:39.6659437Z accuracy pass_rate=92.31% 2025-09-07T09:42:39.6667008Z calls_captured gmean=1265.66x mean=1732.077x 2025-09-07T09:42:39.6670465Z unique_graphs gmean=2.73x mean=2.769x 2025-09-07T09:42:39.6674338Z graph_breaks gmean=6.76x mean=6.769x 2025-09-07T09:42:39.6678593Z unique_graph_breaks gmean=5.00x mean=5.000x 2025-09-07T09:42:39.6682421Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T09:42:39.6686451Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T09:42:39.6690189Z cudagraph_skips gmean=0.00x mean=0.000x 2025-09-07T09:42:39.6691473Z compilation_latency mean=163.348 seconds 2025-09-07T09:42:40.7751409Z + [[ 
training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freezing_cudagraphs-true* ]] 2025-09-07T09:42:40.7752697Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T09:42:40.7753926Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freeze_autotune_cudagraphs-true* ]] 2025-09-07T09:42:40.7755292Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T09:42:40.7756439Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *aotinductor-true* ]] 2025-09-07T09:42:40.7757525Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T09:42:40.7758579Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *maxautotune-true* ]] 2025-09-07T09:42:40.7759644Z + TORCHINDUCTOR_MAX_AUTOTUNE=1 2025-09-07T09:42:40.7760743Z + python benchmarks/dynamo/timm_models.py --accuracy --no-translation-validation --training --amp --backend inductor --device cuda --total-partitions 7 --partition-id 6 --output /var/lib/jenkins/workspace/test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_accuracy.csv 2025-09-07T09:42:41.7984578Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:42:41.7986413Z import pynvml # type: ignore[import] 2025-09-07T09:42:47.0100051Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:42:47.0101467Z import pynvml # type: ignore[import] 2025-09-07T09:42:50.5058446Z 2025-09-07T09:42:51.9129075Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:42:51.9129473Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:42:51.9129815Z cuda train selecsls42b 2025-09-07T09:43:11.3210588Z Autotune Choices Stats: 2025-09-07T09:43:11.3211826Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_5", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8", "best_time": 0.013376000337302685, "best_triton_pos": 0} 2025-09-07T09:43:11.5468086Z AUTOTUNE convolution(8x3x224x224, 32x3x3x3) 2025-09-07T09:43:11.5468386Z strides: [150528, 50176, 224, 1], [27, 9, 3, 1] 2025-09-07T09:43:11.5468666Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:11.5469356Z triton_convolution2d_5 0.0134 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:11.5470447Z triton_convolution2d_1 0.0137 ms 97.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:11.5471524Z triton_convolution2d_3 0.0149 ms 89.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, 
STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:11.5472595Z triton_convolution2d_0 0.0162 ms 82.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:11.5473639Z triton_convolution2d_4 0.0177 ms 75.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:11.5474284Z convolution 0.0229 ms 58.5% 2025-09-07T09:43:11.5474908Z triton_convolution2d_2 0.0250 ms 53.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:11.5476305Z SingleProcess AUTOTUNE benchmarking takes 0.8016 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T09:43:11.6434931Z Autotune Choices Stats: 2025-09-07T09:43:11.6436137Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_11", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8", "best_time": 0.016896000131964684, "best_triton_pos": 0} 2025-09-07T09:43:11.7631131Z AUTOTUNE convolution(8x32x112x112, 64x32x3x3) 2025-09-07T09:43:11.7631681Z strides: [401408, 12544, 112, 1], [288, 9, 3, 1] 2025-09-07T09:43:11.7632174Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:11.7633386Z triton_convolution2d_11 0.0169 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:11.7645438Z triton_convolution2d_10 0.0171 ms 98.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, 
GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:11.7646916Z triton_convolution2d_9 0.0174 ms 97.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:11.7648154Z triton_convolution2d_12 0.0184 ms 91.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:11.7649374Z triton_convolution2d_6 0.0192 ms 88.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:11.7650601Z triton_convolution2d_7 0.0214 ms 79.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:11.7651350Z convolution 0.0320 ms 52.9% 2025-09-07T09:43:11.7652083Z triton_convolution2d_8 0.0707 ms 23.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:11.7653043Z SingleProcess AUTOTUNE benchmarking takes 0.2158 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:12.3082230Z Autotune Choices Stats: 2025-09-07T09:43:12.3083324Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_18", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.009279999881982803, "best_triton_pos": 0} 
2025-09-07T09:43:12.5889111Z AUTOTUNE convolution(8x64x56x56, 64x64x1x1) 2025-09-07T09:43:12.5889675Z strides: [200704, 3136, 56, 1], [64, 1, 1, 1] 2025-09-07T09:43:12.5890147Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:12.5891418Z triton_convolution2d_18 0.0093 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:12.5893422Z triton_convolution2d_17 0.0094 ms 98.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:12.5895718Z triton_convolution2d_13 0.0099 ms 93.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:12.5896953Z triton_convolution2d_16 0.0100 ms 92.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:12.5897709Z convolution 0.0102 ms 90.9% 2025-09-07T09:43:12.5898431Z triton_convolution2d_19 0.0110 ms 84.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:12.5899646Z triton_convolution2d_14 0.0118 ms 78.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:12.7296478Z triton_convolution2d_15 0.0124 ms 74.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 
2025-09-07T09:43:12.7297236Z conv1x1_via_mm 0.0476 ms 19.5% 2025-09-07T09:43:12.7297721Z SingleProcess AUTOTUNE benchmarking takes 0.8248 seconds and 0.0004 seconds precompiling for 9 choices 2025-09-07T09:43:12.7716511Z Autotune Choices Stats: 2025-09-07T09:43:12.7717823Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_24", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.023840000852942467, "best_triton_pos": 0} 2025-09-07T09:43:12.7874867Z AUTOTUNE convolution(8x64x56x56, 32x64x3x3) 2025-09-07T09:43:12.7875388Z strides: [200704, 3136, 56, 1], [576, 9, 3, 1] 2025-09-07T09:43:12.7875704Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:12.7876458Z triton_convolution2d_24 0.0238 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:12.7877685Z triton_convolution2d_20 0.0249 ms 95.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:12.7878929Z triton_convolution2d_23 0.0254 ms 93.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:12.7880144Z triton_convolution2d_25 0.0254 ms 93.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:12.7881358Z triton_convolution2d_26 0.0260 ms 91.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, 
PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:12.7882106Z convolution 0.0285 ms 83.6% 2025-09-07T09:43:12.7882821Z triton_convolution2d_21 0.0298 ms 79.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:12.7884045Z triton_convolution2d_22 0.0723 ms 33.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:12.7885150Z SingleProcess AUTOTUNE benchmarking takes 0.1980 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T09:43:13.1124302Z Autotune Choices Stats: 2025-09-07T09:43:13.1125812Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_32", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.0081599997356534, "best_triton_pos": 0} 2025-09-07T09:43:13.1584166Z AUTOTUNE convolution(8x32x56x56, 64x32x1x1) 2025-09-07T09:43:13.1584720Z strides: [100352, 3136, 56, 1], [32, 1, 1, 1] 2025-09-07T09:43:13.1585486Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:13.1586248Z triton_convolution2d_32 0.0082 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:13.1587479Z triton_convolution2d_30 0.0082 ms 99.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:13.1588968Z triton_convolution2d_31 0.0084 ms 97.0% ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:13.1590174Z triton_convolution2d_33 0.0090 ms 90.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:13.1591525Z triton_convolution2d_27 0.0092 ms 88.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:13.1592262Z convolution 0.0093 ms 87.9% 2025-09-07T09:43:13.1592974Z triton_convolution2d_28 0.0096 ms 85.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:13.1594205Z triton_convolution2d_29 0.0099 ms 82.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:13.1595094Z conv1x1_via_mm 0.0388 ms 21.0% 2025-09-07T09:43:13.1595540Z SingleProcess AUTOTUNE benchmarking takes 0.3704 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:13.8023966Z Autotune Choices Stats: 2025-09-07T09:43:13.8025565Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_46", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.011168000288307667, "best_triton_pos": 0} 2025-09-07T09:43:13.8431729Z AUTOTUNE convolution(8x128x56x56, 64x128x1x1) 2025-09-07T09:43:13.8432091Z strides: [401408, 3136, 56, 1], [128, 1, 1, 1] 2025-09-07T09:43:13.8432385Z dtypes: 
torch.float16, torch.float16 2025-09-07T09:43:13.8433148Z triton_convolution2d_46 0.0112 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:13.8434375Z triton_convolution2d_45 0.0113 ms 98.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:13.8435374Z convolution 0.0113 ms 98.6% 2025-09-07T09:43:13.8436107Z triton_convolution2d_44 0.0120 ms 93.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:13.8437252Z triton_convolution2d_41 0.0126 ms 88.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:13.8438387Z triton_convolution2d_47 0.0135 ms 82.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:13.8439510Z triton_convolution2d_42 0.0155 ms 72.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:13.8440665Z triton_convolution2d_43 0.0169 ms 66.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:13.8441365Z conv1x1_via_mm 0.0556 ms 20.1% 2025-09-07T09:43:13.8441805Z SingleProcess AUTOTUNE benchmarking takes 0.6838 seconds and 0.0002 seconds precompiling for 9 
choices 2025-09-07T09:43:14.1832807Z Autotune Choices Stats: 2025-09-07T09:43:14.1833921Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_52", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.023903999477624893, "best_triton_pos": 0} 2025-09-07T09:43:14.2239453Z AUTOTUNE convolution(8x64x56x56, 64x64x3x3) 2025-09-07T09:43:14.2240636Z strides: [200704, 3136, 56, 1], [576, 9, 3, 1] 2025-09-07T09:43:14.2241176Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:14.2242422Z triton_convolution2d_52 0.0239 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:14.2244408Z triton_convolution2d_54 0.0241 ms 99.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:14.2246949Z triton_convolution2d_51 0.0247 ms 96.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:14.2248113Z triton_convolution2d_53 0.0254 ms 94.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:14.2249236Z triton_convolution2d_48 0.0283 ms 84.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:14.2249932Z convolution 0.0315 ms 75.9% 2025-09-07T09:43:14.2250607Z triton_convolution2d_49 
0.0336 ms 71.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:14.2251749Z triton_convolution2d_50 0.0691 ms 34.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:14.2252655Z SingleProcess AUTOTUNE benchmarking takes 0.3802 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:14.5785966Z Autotune Choices Stats: 2025-09-07T09:43:14.5787328Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.012415999546647072, "best_triton_pos": 1, "best_triton_time": 0.014047999866306782, "best_triton_kernel": "triton_convolution2d_88", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T09:43:14.6093257Z AUTOTUNE convolution(8x192x56x56, 128x192x1x1) 2025-09-07T09:43:14.6093833Z strides: [602112, 3136, 56, 1], [192, 1, 1, 1] 2025-09-07T09:43:14.6094308Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:14.6094758Z convolution 0.0124 ms 100.0% 2025-09-07T09:43:14.6096111Z triton_convolution2d_88 0.0140 ms 88.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:14.6097357Z triton_convolution2d_87 0.0143 ms 87.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:14.6098611Z triton_convolution2d_86 0.0156 ms 79.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, 
KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:14.6100174Z triton_convolution2d_83 0.0164 ms 75.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:14.6101466Z triton_convolution2d_89 0.0166 ms 74.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:14.6102886Z triton_convolution2d_84 0.0200 ms 62.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:14.6104110Z triton_convolution2d_85 0.0230 ms 54.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:14.6104866Z conv1x1_via_mm 0.0819 ms 15.2% 2025-09-07T09:43:14.6105534Z SingleProcess AUTOTUNE benchmarking takes 0.3828 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:15.1150186Z Autotune Choices Stats: 2025-09-07T09:43:15.1151604Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.034015998244285583, "best_triton_pos": 1, "best_triton_time": 0.051072001457214355, "best_triton_kernel": "triton_convolution2d_95", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T09:43:15.2670768Z AUTOTUNE convolution(8x128x56x56, 144x128x3x3) 2025-09-07T09:43:15.2671112Z strides: [401408, 3136, 56, 1], [1152, 9, 3, 1] 2025-09-07T09:43:15.2671405Z dtypes: 
torch.float16, torch.float16 2025-09-07T09:43:15.2671690Z convolution 0.0340 ms 100.0% 2025-09-07T09:43:15.2672432Z triton_convolution2d_95 0.0511 ms 66.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:15.2673670Z triton_convolution2d_96 0.0524 ms 64.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:15.2674899Z triton_convolution2d_94 0.0537 ms 63.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:15.2676476Z triton_convolution2d_93 0.0626 ms 54.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:15.2677672Z triton_convolution2d_91 0.0628 ms 54.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:15.2678797Z triton_convolution2d_90 0.0757 ms 44.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:15.2679933Z triton_convolution2d_92 0.1849 ms 18.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:15.2680841Z SingleProcess AUTOTUNE benchmarking takes 0.6563 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:15.4155116Z Autotune 
Choices Stats: 2025-09-07T09:43:15.4156426Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_101", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.010944000445306301, "best_triton_pos": 0} 2025-09-07T09:43:15.4373932Z AUTOTUNE convolution(8x144x28x28, 144x144x1x1) 2025-09-07T09:43:15.4374284Z strides: [112896, 784, 28, 1], [144, 1, 1, 1] 2025-09-07T09:43:15.4374593Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:15.4375866Z triton_convolution2d_101 0.0109 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:15.4377151Z triton_convolution2d_100 0.0123 ms 89.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:15.4378399Z triton_convolution2d_102 0.0125 ms 87.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:15.4379650Z triton_convolution2d_103 0.0128 ms 85.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:15.4380896Z triton_convolution2d_97 0.0147 ms 74.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:15.4382239Z triton_convolution2d_98 0.0151 ms 72.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, 
KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:15.4383471Z triton_convolution2d_99 0.0165 ms 66.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:15.4384233Z convolution 0.0948 ms 11.5% 2025-09-07T09:43:15.4384487Z conv1x1_via_mm 0.4173 ms 2.6% 2025-09-07T09:43:15.4385139Z SingleProcess AUTOTUNE benchmarking takes 0.1685 seconds and 0.0003 seconds precompiling for 9 choices 2025-09-07T09:43:15.8826908Z Autotune Choices Stats: 2025-09-07T09:43:15.8828338Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03203200176358223, "best_triton_pos": 1, "best_triton_time": 0.040063999593257904, "best_triton_kernel": "triton_convolution2d_109", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T09:43:15.9300010Z AUTOTUNE convolution(8x144x28x28, 72x144x3x3) 2025-09-07T09:43:15.9300374Z strides: [112896, 784, 28, 1], [1296, 9, 3, 1] 2025-09-07T09:43:15.9300680Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:15.9300976Z convolution 0.0320 ms 100.0% 2025-09-07T09:43:15.9301897Z triton_convolution2d_109 0.0401 ms 80.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:15.9303160Z triton_convolution2d_108 0.0461 ms 69.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:15.9304393Z triton_convolution2d_107 0.0465 ms 68.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, 
BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:15.9306233Z triton_convolution2d_104 0.0494 ms 64.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:15.9307626Z triton_convolution2d_110 0.0545 ms 58.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:15.9309184Z triton_convolution2d_105 0.0674 ms 47.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:15.9310449Z triton_convolution2d_106 0.1288 ms 24.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:15.9311443Z SingleProcess AUTOTUNE benchmarking takes 0.4916 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:16.3218515Z Autotune Choices Stats: 2025-09-07T09:43:16.3219930Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.00902399979531765, "best_triton_pos": 1, "best_triton_time": 0.009151999838650227, "best_triton_kernel": "triton_convolution2d_115", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:43:16.3551139Z AUTOTUNE convolution(8x72x28x28, 144x72x1x1) 2025-09-07T09:43:16.3551447Z strides: [56448, 784, 28, 1], [72, 1, 1, 1] 2025-09-07T09:43:16.3551729Z dtypes: torch.float16, torch.float16 
2025-09-07T09:43:16.3552021Z convolution 0.0090 ms 100.0% 2025-09-07T09:43:16.3552750Z triton_convolution2d_115 0.0092 ms 98.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:16.3553986Z triton_convolution2d_114 0.0103 ms 87.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:16.3555576Z triton_convolution2d_117 0.0104 ms 86.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:16.3556772Z triton_convolution2d_116 0.0106 ms 85.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:16.3557904Z triton_convolution2d_111 0.0116 ms 77.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:16.3559030Z triton_convolution2d_112 0.0118 ms 76.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:16.3560184Z triton_convolution2d_113 0.0124 ms 72.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:16.3560889Z conv1x1_via_mm 0.0288 ms 31.3% 2025-09-07T09:43:16.3561328Z SingleProcess AUTOTUNE benchmarking takes 0.4243 seconds and 0.0002 seconds precompiling for 9 choices 
2025-09-07T09:43:16.7612143Z Autotune Choices Stats: 2025-09-07T09:43:16.7613625Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01033599954098463, "best_triton_pos": 1, "best_triton_time": 0.014271999709308147, "best_triton_kernel": "triton_convolution2d_129", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:43:16.8232341Z AUTOTUNE convolution(8x288x28x28, 144x288x1x1) 2025-09-07T09:43:16.8232719Z strides: [225792, 784, 28, 1], [288, 1, 1, 1] 2025-09-07T09:43:16.8233019Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:16.8233285Z convolution 0.0103 ms 100.0% 2025-09-07T09:43:16.8234492Z triton_convolution2d_129 0.0143 ms 72.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:16.8235967Z triton_convolution2d_130 0.0165 ms 62.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:16.8237198Z triton_convolution2d_128 0.0168 ms 61.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:16.8238343Z triton_convolution2d_131 0.0174 ms 59.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:16.8239498Z triton_convolution2d_125 0.0213 ms 48.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, 
num_stages=2, num_warps=4 2025-09-07T09:43:16.8240642Z triton_convolution2d_126 0.0231 ms 44.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:16.8241790Z triton_convolution2d_127 0.0262 ms 39.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:16.8242494Z conv1x1_via_mm 0.0441 ms 23.4% 2025-09-07T09:43:16.8242937Z SingleProcess AUTOTUNE benchmarking takes 0.4658 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:17.1526106Z Autotune Choices Stats: 2025-09-07T09:43:17.1527647Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.030880000442266464, "best_triton_pos": 1, "best_triton_time": 0.0544000007212162, "best_triton_kernel": "triton_convolution2d_138", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T09:43:17.1647707Z AUTOTUNE convolution(8x144x28x28, 144x144x3x3) 2025-09-07T09:43:17.1648115Z strides: [112896, 784, 28, 1], [1296, 9, 3, 1] 2025-09-07T09:43:17.1648418Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:17.1648745Z convolution 0.0309 ms 100.0% 2025-09-07T09:43:17.1649532Z triton_convolution2d_138 0.0544 ms 56.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:17.1650829Z triton_convolution2d_136 0.0570 ms 54.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 
2025-09-07T09:43:17.1652090Z triton_convolution2d_135 0.0580 ms 53.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:17.1653320Z triton_convolution2d_137 0.0602 ms 51.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:17.1654932Z triton_convolution2d_133 0.0677 ms 45.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:17.1657047Z triton_convolution2d_132 0.0786 ms 39.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:17.1658233Z triton_convolution2d_134 0.1224 ms 25.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:17.1659153Z SingleProcess AUTOTUNE benchmarking takes 0.3400 seconds and 0.0003 seconds precompiling for 8 choices 2025-09-07T09:43:17.6586850Z Autotune Choices Stats: 2025-09-07T09:43:17.6588592Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.011552000418305397, "best_triton_pos": 1, "best_triton_time": 0.019200000911951065, "best_triton_kernel": "triton_convolution2d_171", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:43:17.6914276Z AUTOTUNE convolution(8x432x28x28, 288x432x1x1) 2025-09-07T09:43:17.6914635Z 
strides: [338688, 784, 28, 1], [432, 1, 1, 1] 2025-09-07T09:43:17.6914929Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:17.6915514Z convolution 0.0116 ms 100.0% 2025-09-07T09:43:17.6916302Z triton_convolution2d_171 0.0192 ms 60.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:17.6917782Z triton_convolution2d_172 0.0217 ms 53.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:17.6919037Z triton_convolution2d_170 0.0220 ms 52.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:17.6920311Z triton_convolution2d_173 0.0221 ms 52.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:17.6921581Z triton_convolution2d_167 0.0295 ms 39.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:17.6922805Z triton_convolution2d_168 0.0304 ms 38.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:17.6924036Z triton_convolution2d_169 0.0350 ms 33.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:17.6924783Z conv1x1_via_mm 0.0678 ms 17.0% 2025-09-07T09:43:17.6925438Z 
SingleProcess AUTOTUNE benchmarking takes 0.5225 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:18.1346604Z Autotune Choices Stats: 2025-09-07T09:43:18.1348047Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04854400083422661, "best_triton_pos": 1, "best_triton_time": 0.07203199714422226, "best_triton_kernel": "triton_convolution2d_178", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T09:43:18.1646475Z AUTOTUNE convolution(8x288x28x28, 304x288x3x3) 2025-09-07T09:43:18.1646835Z strides: [225792, 784, 28, 1], [2592, 9, 3, 1] 2025-09-07T09:43:18.1647158Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:18.1647452Z convolution 0.0485 ms 100.0% 2025-09-07T09:43:18.1648557Z triton_convolution2d_178 0.0720 ms 67.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:18.1649813Z triton_convolution2d_177 0.1024 ms 47.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:18.1651039Z triton_convolution2d_180 0.1054 ms 46.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:18.1652275Z triton_convolution2d_179 0.1307 ms 37.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:18.1653486Z triton_convolution2d_175 0.1379 ms 35.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, 
BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:18.1654703Z triton_convolution2d_174 0.2713 ms 17.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:18.1656139Z triton_convolution2d_176 0.3201 ms 15.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:18.1657100Z SingleProcess AUTOTUNE benchmarking takes 0.4719 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:18.2966939Z Autotune Choices Stats: 2025-09-07T09:43:18.2968371Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.013311999849975109, "best_triton_pos": 1, "best_triton_time": 0.014911999925971031, "best_triton_kernel": "triton_convolution2d_185", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:43:18.3421760Z AUTOTUNE convolution(8x304x14x14, 304x304x1x1) 2025-09-07T09:43:18.3422095Z strides: [59584, 196, 14, 1], [304, 1, 1, 1] 2025-09-07T09:43:18.3422419Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:18.3422735Z convolution 0.0133 ms 100.0% 2025-09-07T09:43:18.3423501Z triton_convolution2d_185 0.0149 ms 89.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:18.3424779Z triton_convolution2d_184 0.0172 ms 77.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, 
STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:18.3426480Z triton_convolution2d_186 0.0172 ms 77.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:18.3427871Z triton_convolution2d_187 0.0183 ms 72.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:18.3429427Z triton_convolution2d_181 0.0226 ms 59.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:18.3430660Z triton_convolution2d_182 0.0236 ms 56.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:18.3431596Z conv1x1_via_mm 0.0278 ms 47.8% 2025-09-07T09:43:18.3432337Z triton_convolution2d_183 0.0280 ms 47.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:18.3433312Z SingleProcess AUTOTUNE benchmarking takes 0.1763 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:18.5213018Z Autotune Choices Stats: 2025-09-07T09:43:18.5214386Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04323200136423111, "best_triton_pos": 1, "best_triton_time": 0.08352000266313553, "best_triton_kernel": "triton_convolution2d_192", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 
2025-09-07T09:43:18.5547636Z AUTOTUNE convolution(8x304x14x14, 152x304x3x3) 2025-09-07T09:43:18.5547973Z strides: [59584, 196, 14, 1], [2736, 9, 3, 1] 2025-09-07T09:43:18.5548257Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:18.5548515Z convolution 0.0432 ms 100.0% 2025-09-07T09:43:18.5549255Z triton_convolution2d_192 0.0835 ms 51.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:18.5550482Z triton_convolution2d_194 0.1094 ms 39.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:18.5551700Z triton_convolution2d_191 0.1096 ms 39.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:18.5552928Z triton_convolution2d_193 0.1148 ms 37.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:18.5554134Z triton_convolution2d_189 0.1377 ms 31.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:18.5555508Z triton_convolution2d_188 0.1445 ms 29.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:18.5556748Z triton_convolution2d_190 0.2474 ms 17.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, 
num_stages=1, num_warps=8 2025-09-07T09:43:18.5557673Z SingleProcess AUTOTUNE benchmarking takes 0.2111 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:18.6866052Z Autotune Choices Stats: 2025-09-07T09:43:18.6868094Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_199", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.010400000028312206, "best_triton_pos": 0} 2025-09-07T09:43:18.8414732Z AUTOTUNE convolution(8x152x14x14, 304x152x1x1) 2025-09-07T09:43:18.8415427Z strides: [29792, 196, 14, 1], [152, 1, 1, 1] 2025-09-07T09:43:18.8415732Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:18.8416541Z triton_convolution2d_199 0.0104 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:18.8417538Z convolution 0.0112 ms 93.1% 2025-09-07T09:43:18.8419152Z triton_convolution2d_198 0.0125 ms 82.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:18.8421154Z triton_convolution2d_200 0.0128 ms 81.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:18.8423292Z triton_convolution2d_201 0.0130 ms 79.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:18.8425460Z triton_convolution2d_195 0.0159 ms 65.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, 
KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:18.8427501Z triton_convolution2d_196 0.0162 ms 64.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:18.8429014Z triton_convolution2d_197 0.0176 ms 59.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:18.8429773Z conv1x1_via_mm 0.0263 ms 39.6% 2025-09-07T09:43:18.8430245Z SingleProcess AUTOTUNE benchmarking takes 0.2855 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:18.9748766Z Autotune Choices Stats: 2025-09-07T09:43:18.9750124Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01865600049495697, "best_triton_pos": 1, "best_triton_time": 0.021727999672293663, "best_triton_kernel": "triton_convolution2d_213", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:43:19.3703356Z AUTOTUNE convolution(8x608x14x14, 304x608x1x1) 2025-09-07T09:43:19.3703898Z strides: [119168, 196, 14, 1], [608, 1, 1, 1] 2025-09-07T09:43:19.3704369Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:19.3704819Z convolution 0.0187 ms 100.0% 2025-09-07T09:43:19.3706490Z triton_convolution2d_213 0.0217 ms 85.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:19.3708268Z triton_convolution2d_212 0.0257 ms 72.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, 
PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:19.3709494Z triton_convolution2d_214 0.0263 ms 71.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:19.3710706Z triton_convolution2d_215 0.0267 ms 69.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:19.3711911Z triton_convolution2d_209 0.0336 ms 55.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:19.3712899Z conv1x1_via_mm 0.0348 ms 53.6% 2025-09-07T09:43:19.3713622Z triton_convolution2d_210 0.0392 ms 47.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:19.3715143Z triton_convolution2d_211 0.0500 ms 37.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:19.3716155Z SingleProcess AUTOTUNE benchmarking takes 0.5267 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:19.5662782Z Autotune Choices Stats: 2025-09-07T09:43:19.5665490Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04287999868392944, "best_triton_pos": 1, "best_triton_time": 0.08230400085449219, "best_triton_kernel": "triton_convolution2d_220", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, 
num_stages=2, num_warps=4"} 2025-09-07T09:43:19.6154795Z AUTOTUNE convolution(8x304x14x14, 304x304x3x3) 2025-09-07T09:43:19.6155243Z strides: [59584, 196, 14, 1], [2736, 9, 3, 1] 2025-09-07T09:43:19.6155546Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:19.6155826Z convolution 0.0429 ms 100.0% 2025-09-07T09:43:19.6156573Z triton_convolution2d_220 0.0823 ms 52.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:19.6158667Z triton_convolution2d_222 0.1098 ms 39.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:19.6160626Z triton_convolution2d_219 0.1131 ms 37.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:19.6162571Z triton_convolution2d_217 0.1381 ms 31.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:19.6164516Z triton_convolution2d_221 0.1417 ms 30.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:19.6166755Z triton_convolution2d_218 0.2428 ms 17.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:19.6168152Z triton_convolution2d_216 0.2738 ms 15.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, 
STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:19.6169051Z SingleProcess AUTOTUNE benchmarking takes 0.2437 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:19.7662573Z Autotune Choices Stats: 2025-09-07T09:43:19.7663863Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.022943999618291855, "best_triton_pos": 1, "best_triton_time": 0.029664000496268272, "best_triton_kernel": "triton_convolution2d_255", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:43:19.7743096Z AUTOTUNE convolution(8x912x14x14, 480x912x1x1) 2025-09-07T09:43:19.7743758Z strides: [178752, 196, 14, 1], [912, 1, 1, 1] 2025-09-07T09:43:19.7744250Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:19.7744738Z convolution 0.0229 ms 100.0% 2025-09-07T09:43:19.7746200Z triton_convolution2d_255 0.0297 ms 77.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:19.7748505Z triton_convolution2d_254 0.0364 ms 63.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:19.7749754Z triton_convolution2d_256 0.0380 ms 60.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:19.7750987Z triton_convolution2d_257 0.0381 ms 60.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:19.7751731Z 
conv1x1_via_mm 0.0494 ms 46.4% 2025-09-07T09:43:19.7752453Z triton_convolution2d_251 0.0545 ms 42.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:19.7753673Z triton_convolution2d_252 0.0563 ms 40.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:19.7754893Z triton_convolution2d_253 0.0695 ms 33.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:19.7756027Z SingleProcess AUTOTUNE benchmarking takes 0.1543 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:20.4633988Z Autotune Choices Stats: 2025-09-07T09:43:20.4635664Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.055615998804569244, "best_triton_pos": 1, "best_triton_time": 0.12716799974441528, "best_triton_kernel": "triton_convolution2d_262", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T09:43:20.5241745Z AUTOTUNE convolution(8x480x14x14, 960x480x3x3) 2025-09-07T09:43:20.5242091Z strides: [94080, 196, 14, 1], [4320, 9, 3, 1] 2025-09-07T09:43:20.5242410Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:20.5242706Z convolution 0.0556 ms 100.0% 2025-09-07T09:43:20.5243464Z triton_convolution2d_262 0.1272 ms 43.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:20.5244748Z triton_convolution2d_264 
0.1819 ms 30.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:20.5246152Z triton_convolution2d_261 0.1845 ms 30.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:20.5247504Z triton_convolution2d_263 0.2244 ms 24.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:20.5248670Z triton_convolution2d_259 0.2326 ms 23.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:20.5250056Z triton_convolution2d_260 0.3466 ms 16.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:20.5251394Z triton_convolution2d_258 0.4300 ms 12.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:20.5252298Z SingleProcess AUTOTUNE benchmarking takes 0.7488 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:21.1690769Z Autotune Choices Stats: 2025-09-07T09:43:21.1692977Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.09251199662685394, "best_triton_pos": 1, "best_triton_time": 0.2022400051355362, "best_triton_kernel": "triton_convolution2d_269", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, 
PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T09:43:21.2611895Z AUTOTUNE convolution(8x960x7x7, 1024x960x3x3) 2025-09-07T09:43:21.2624516Z strides: [47040, 49, 7, 1], [8640, 9, 3, 1] 2025-09-07T09:43:21.2624835Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:21.2625303Z convolution 0.0925 ms 100.0% 2025-09-07T09:43:21.2625993Z triton_convolution2d_269 0.2022 ms 45.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:21.2627099Z triton_convolution2d_271 0.3278 ms 28.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:21.2628226Z triton_convolution2d_268 0.3581 ms 25.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:21.2629383Z triton_convolution2d_266 0.4237 ms 21.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:21.2630519Z triton_convolution2d_270 0.4335 ms 21.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:21.2631655Z triton_convolution2d_267 0.4644 ms 19.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=512, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:21.2632799Z triton_convolution2d_265 0.8181 ms 11.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, 
KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:21.2633699Z SingleProcess AUTOTUNE benchmarking takes 0.7357 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:22.0163261Z Autotune Choices Stats: 2025-09-07T09:43:22.0165996Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.10931199789047241, "best_triton_pos": 1, "best_triton_time": 0.26025599241256714, "best_triton_kernel": "triton_convolution2d_276", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T09:43:22.2406021Z AUTOTUNE convolution(8x1024x7x7, 1280x1024x3x3) 2025-09-07T09:43:22.4661849Z strides: [50176, 49, 7, 1], [9216, 9, 3, 1] 2025-09-07T09:43:22.4662337Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:22.4662778Z convolution 0.1093 ms 100.0% 2025-09-07T09:43:22.4663976Z triton_convolution2d_276 0.2603 ms 42.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:22.4666830Z triton_convolution2d_278 0.3412 ms 32.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:22.4669090Z triton_convolution2d_275 0.3818 ms 28.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:22.4670442Z triton_convolution2d_277 0.4600 ms 23.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, 
UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:43:22.4671672Z triton_convolution2d_273 0.5040 ms 21.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:22.4672899Z triton_convolution2d_274 0.5188 ms 21.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=512, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:43:22.4674119Z triton_convolution2d_272 1.1535 ms 9.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:43:22.4675219Z SingleProcess AUTOTUNE benchmarking takes 0.9783 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:43:22.5697165Z Autotune Choices Stats: 2025-09-07T09:43:22.5699315Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.012608000077307224, "best_triton_pos": 2, "best_triton_time": 0.03596799820661545, "best_triton_kernel": "triton_convolution2d_283", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:43:22.6985971Z AUTOTUNE convolution(8x1280x4x4, 1024x1280x1x1) 2025-09-07T09:43:22.6986544Z strides: [20480, 16, 4, 1], [1280, 1, 1, 1] 2025-09-07T09:43:22.6987027Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:22.6987530Z convolution 0.0126 ms 100.0% 2025-09-07T09:43:22.6987965Z conv1x1_via_mm 0.0262 ms 48.2% 2025-09-07T09:43:22.6989386Z triton_convolution2d_283 0.0360 ms 35.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, 
num_warps=4 2025-09-07T09:43:22.6990673Z triton_convolution2d_285 0.0436 ms 28.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:22.6991926Z triton_convolution2d_284 0.0450 ms 28.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:22.6993163Z triton_convolution2d_282 0.0466 ms 27.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:43:22.6994629Z triton_convolution2d_281 0.0556 ms 22.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:43:22.6996373Z triton_convolution2d_280 0.0666 ms 18.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:22.6997601Z triton_convolution2d_279 0.0725 ms 17.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:43:22.6998663Z SingleProcess AUTOTUNE benchmarking takes 0.4568 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:43:23.1160513Z Autotune Choices Stats: 2025-09-07T09:43:23.1161612Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_290", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 
0.009119999594986439, "best_triton_pos": 0} 2025-09-07T09:43:23.1290021Z AUTOTUNE addmm(8x1000, 8x1024, 1024x1000) 2025-09-07T09:43:23.1290473Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T09:43:23.1290820Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:43:23.1291551Z triton_mm_290 0.0091 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:43:23.1292564Z triton_mm_294 0.0098 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:43:23.1293192Z bias_addmm 0.0102 ms 89.6% 2025-09-07T09:43:23.1293801Z triton_mm_298 0.0111 ms 81.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:43:23.1294785Z triton_mm_302 0.0115 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:43:23.1296033Z triton_mm_289 0.0121 ms 75.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:43:23.1296992Z triton_mm_288 0.0125 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:43:23.1298087Z triton_mm_293 0.0127 ms 71.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:43:23.1299030Z triton_mm_287 0.0132 ms 69.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 
2025-09-07T09:43:23.1299980Z triton_mm_297 0.0134 ms 68.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:43:23.1300813Z SingleProcess AUTOTUNE benchmarking takes 0.4289 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:43:35.4014663Z Autotune Choices Stats: 2025-09-07T09:43:35.4016150Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_329", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.006816000211983919, "best_triton_pos": 0} 2025-09-07T09:43:35.6296355Z AUTOTUNE mm(1000x8, 8x1024) 2025-09-07T09:43:35.6296797Z strides: [1, 1000], [1024, 1] 2025-09-07T09:43:35.6297225Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:35.6298317Z triton_mm_329 0.0068 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:43:35.6300636Z triton_mm_327 0.0069 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:43:35.6302294Z triton_mm_324 0.0070 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:43:35.6303429Z triton_mm_325 0.0070 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:43:35.6304350Z triton_mm_331 0.0070 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:43:35.6305549Z 
triton_mm_328 0.0070 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:43:35.6306452Z triton_mm_322 0.0071 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:43:35.6307334Z triton_mm_330 0.0071 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:43:35.6308234Z triton_mm_332 0.0071 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:43:35.6309140Z triton_mm_323 0.0072 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:43:35.6309928Z SingleProcess AUTOTUNE benchmarking takes 0.5461 seconds and 0.0003 seconds precompiling for 17 choices 2025-09-07T09:43:36.7244633Z Autotune Choices Stats: 2025-09-07T09:43:36.7246318Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 0.00940799992531538, "best_triton_pos": 1, "best_triton_time": 0.009824000298976898, "best_triton_kernel": "triton_mm_311", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:43:37.0280553Z AUTOTUNE mm(8x1000, 1000x1024) 2025-09-07T09:43:37.0280861Z strides: [1000, 1], [1024, 1] 2025-09-07T09:43:37.0281124Z dtypes: torch.float16, torch.float16 2025-09-07T09:43:37.0281388Z mm 0.0094 ms 100.0% 2025-09-07T09:43:37.0282094Z triton_mm_311 0.0098 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:43:37.0283198Z triton_mm_307 0.0100 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:43:37.0284229Z triton_mm_315 0.0103 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:43:37.0285392Z triton_mm_305 0.0115 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:43:37.0286371Z triton_mm_306 0.0116 ms 81.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:43:37.0287330Z triton_mm_319 0.0117 ms 80.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:43:37.0288754Z triton_mm_310 0.0120 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:43:37.0289713Z triton_mm_317 0.0128 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:43:37.0290908Z triton_mm_314 0.0130 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:43:37.0291815Z SingleProcess AUTOTUNE benchmarking takes 0.5887 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:43:45.1949779Z W0907 09:43:45.194000 51673 site-packages/torch/_logging/_internal.py:1199] [6/0] 
Profiler function will be ignored 2025-09-07T09:44:02.4461028Z pass 2025-09-07T09:44:07.5340845Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:44:07.5342711Z import pynvml # type: ignore[import] 2025-09-07T09:44:10.6477290Z 2025-09-07T09:44:11.9545915Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:44:11.9546276Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:44:11.9546573Z cuda train spnasnet_100 2025-09-07T09:44:39.5126093Z Autotune Choices Stats: 2025-09-07T09:44:39.5128441Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.015519999898970127, "best_triton_pos": 1, "best_triton_time": 0.023615999147295952, "best_triton_kernel": "triton_convolution2d_4", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T09:44:39.5824578Z AUTOTUNE convolution(8x3x224x224, 32x3x3x3) 2025-09-07T09:44:39.5825394Z strides: [150528, 1, 672, 3], [27, 1, 9, 3] 2025-09-07T09:44:39.5825885Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:39.5826340Z convolution 0.0155 ms 100.0% 2025-09-07T09:44:39.5827530Z triton_convolution2d_4 0.0236 ms 65.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:44:39.5829511Z triton_convolution2d_0 0.0271 ms 57.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:44:39.5831951Z triton_convolution2d_2 0.0276 ms 
56.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:44:39.5833256Z triton_convolution2d_3 0.0287 ms 54.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:44:39.5834485Z triton_convolution2d_5 0.0342 ms 45.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:44:39.5835853Z triton_convolution2d_1 0.0422 ms 36.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:44:39.5836842Z SingleProcess AUTOTUNE benchmarking takes 0.1626 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T09:44:40.0951660Z Autotune Choices Stats: 2025-09-07T09:44:40.0952988Z {"num_choices": 13, "num_triton_choices": 12, "best_kernel": "triton_mm_11", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.00940799992531538, "best_triton_pos": 0} 2025-09-07T09:44:40.2974095Z AUTOTUNE mm(100352x32, 32x16) 2025-09-07T09:44:40.2974602Z strides: [32, 1], [1, 32] 2025-09-07T09:44:40.2975541Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:40.2977138Z triton_mm_11 0.0094 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:40.2978787Z triton_mm_12 0.0095 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:40.2980354Z triton_mm_9 0.0096 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:40.2981918Z triton_mm_7 0.0096 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:44:40.2982801Z triton_mm_14 0.0096 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:40.2983738Z triton_mm_16 0.0096 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:44:40.2984625Z triton_mm_13 0.0097 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:40.2985662Z triton_mm_17 0.0097 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:40.2986559Z triton_mm_8 0.0097 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:44:40.2987450Z triton_mm_15 0.0097 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:40.2988244Z SingleProcess AUTOTUNE benchmarking takes 0.7132 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:44:40.4802785Z Autotune Choices Stats: 2025-09-07T09:44:40.4804344Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_mm_25", 
"best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010367999784648418, "best_triton_pos": 0} 2025-09-07T09:44:40.5140942Z AUTOTUNE mm(100352x16, 16x48) 2025-09-07T09:44:40.5141317Z strides: [16, 1], [1, 16] 2025-09-07T09:44:40.5141703Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:40.5142358Z triton_mm_25 0.0104 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:40.5143335Z triton_mm_23 0.0104 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:40.5144286Z triton_mm_28 0.0105 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:40.5145397Z triton_mm_26 0.0105 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:40.5146658Z triton_mm_32 0.0108 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:40.5147652Z triton_mm_31 0.0109 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:44:40.5148807Z triton_mm_30 0.0110 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:40.5149875Z triton_mm_29 0.0110 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:40.5150874Z triton_mm_18 0.0111 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:44:40.5151830Z triton_mm_27 0.0111 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:40.5152672Z SingleProcess AUTOTUNE benchmarking takes 0.2158 seconds and 0.0003 seconds precompiling for 16 choices 2025-09-07T09:44:40.7351790Z Autotune Choices Stats: 2025-09-07T09:44:40.7353003Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_40", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.007679999805986881, "best_triton_pos": 0} 2025-09-07T09:44:40.7428133Z AUTOTUNE mm(25088x48, 48x24) 2025-09-07T09:44:40.7428563Z strides: [48, 1], [1, 48] 2025-09-07T09:44:40.7428986Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:40.7430040Z triton_mm_40 0.0077 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:40.7431738Z triton_mm_34 0.0078 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:40.7432691Z triton_mm_37 0.0079 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:40.7433664Z triton_mm_43 0.0079 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T09:44:40.7434612Z triton_mm_44 0.0080 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:40.7435887Z triton_mm_42 0.0080 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:40.7436844Z triton_mm_49 0.0080 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:40.7437800Z triton_mm_36 0.0081 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:40.7438762Z triton_mm_48 0.0081 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:40.7439723Z triton_mm_38 0.0082 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:40.7440824Z SingleProcess AUTOTUNE benchmarking takes 0.2279 seconds and 0.0004 seconds precompiling for 18 choices 2025-09-07T09:44:40.9532840Z Autotune Choices Stats: 2025-09-07T09:44:40.9533840Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_60", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.007584000006318092, "best_triton_pos": 0} 2025-09-07T09:44:41.1813424Z AUTOTUNE mm(25088x24, 24x72) 2025-09-07T09:44:41.1813948Z strides: [24, 1], [1, 24] 2025-09-07T09:44:41.1814233Z dtypes: torch.float16, torch.float16 
2025-09-07T09:44:41.1814905Z triton_mm_60 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:41.1816274Z triton_mm_61 0.0077 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:41.1817258Z triton_mm_59 0.0078 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:41.1818225Z triton_mm_64 0.0079 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:44:41.1819192Z triton_mm_66 0.0080 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:41.1820212Z triton_mm_65 0.0081 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:41.1821044Z triton_mm_58 0.0083 ms 91.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:41.1821939Z triton_mm_54 0.0083 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:41.1822780Z triton_mm_56 0.0084 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:41.1823611Z triton_mm_63 0.0085 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:41.1824342Z SingleProcess AUTOTUNE benchmarking takes 0.4377 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T09:44:41.3946771Z Autotune Choices Stats: 2025-09-07T09:44:41.3948346Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_76", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.008832000195980072, "best_triton_pos": 0} 2025-09-07T09:44:41.6146988Z AUTOTUNE mm(25088x72, 72x24) 2025-09-07T09:44:41.6147271Z strides: [72, 1], [1, 72] 2025-09-07T09:44:41.6147517Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:41.6148079Z triton_mm_76 0.0088 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:41.6148882Z triton_mm_70 0.0089 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:41.6149710Z triton_mm_80 0.0089 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:41.6150793Z triton_mm_78 0.0092 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:41.6151791Z triton_mm_68 0.0092 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:41.6152937Z triton_mm_67 0.0093 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:44:41.6153553Z mm 0.0093 ms 94.5% 2025-09-07T09:44:41.6154126Z triton_mm_71 0.0093 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:41.6155413Z triton_mm_81 0.0094 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:44:41.6156390Z triton_mm_72 0.0095 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:41.6157234Z SingleProcess AUTOTUNE benchmarking takes 0.4323 seconds and 0.0004 seconds precompiling for 18 choices 2025-09-07T09:44:41.8302009Z Autotune Choices Stats: 2025-09-07T09:44:41.8303013Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_124", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.01017600018531084, "best_triton_pos": 0} 2025-09-07T09:44:42.0471530Z AUTOTUNE mm(25088x24, 24x144) 2025-09-07T09:44:42.0471789Z strides: [24, 1], [1, 24] 2025-09-07T09:44:42.0472036Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:42.0472692Z triton_mm_124 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:42.0473668Z triton_mm_125 0.0103 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:42.0474643Z triton_mm_126 0.0103 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:42.0475918Z triton_mm_122 0.0103 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:42.0476891Z triton_mm_128 0.0104 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:42.0477919Z triton_mm_131 0.0106 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:42.0478899Z triton_mm_130 0.0106 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:42.0479861Z triton_mm_118 0.0109 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:44:42.0480831Z triton_mm_129 0.0109 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:42.0481752Z triton_mm_127 0.0109 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:42.0482746Z SingleProcess AUTOTUNE benchmarking takes 0.4303 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:44:42.6026391Z Autotune Choices Stats: 2025-09-07T09:44:42.6027957Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_146", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4", "best_time": 0.007968000136315823, "best_triton_pos": 0} 2025-09-07T09:44:42.7548789Z AUTOTUNE mm(6272x144, 144x40) 2025-09-07T09:44:42.7549249Z strides: [144, 1], [1, 144] 2025-09-07T09:44:42.7549671Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:42.7550924Z triton_mm_146 0.0080 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:42.7552096Z triton_mm_142 0.0080 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:42.7553055Z triton_mm_145 0.0080 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:42.7554012Z triton_mm_144 0.0082 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:42.7554618Z mm 0.0082 ms 96.9% 2025-09-07T09:44:42.7555511Z triton_mm_149 0.0084 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:42.7556476Z triton_mm_138 0.0084 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:42.7557440Z triton_mm_148 0.0084 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:42.7558411Z triton_mm_151 0.0084 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T09:44:42.7559383Z triton_mm_147 0.0084 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:42.7560233Z SingleProcess AUTOTUNE benchmarking takes 0.7067 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T09:44:42.9858959Z Autotune Choices Stats: 2025-09-07T09:44:42.9860604Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_164", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.00723200011998415, "best_triton_pos": 0} 2025-09-07T09:44:42.9875340Z AUTOTUNE mm(6272x40, 40x120) 2025-09-07T09:44:42.9875606Z strides: [40, 1], [1, 40] 2025-09-07T09:44:42.9875837Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:42.9876499Z triton_mm_164 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:42.9877478Z triton_mm_157 0.0074 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:42.9878463Z triton_mm_160 0.0075 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:42.9879696Z triton_mm_161 0.0075 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:42.9880692Z triton_mm_165 0.0075 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:42.9881635Z 
triton_mm_154 0.0077 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:42.9882660Z triton_mm_171 0.0078 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:42.9883569Z triton_mm_170 0.0079 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:42.9884476Z triton_mm_153 0.0079 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:44:42.9885534Z triton_mm_166 0.0080 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:42.9886322Z SingleProcess AUTOTUNE benchmarking takes 0.2319 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T09:44:43.6592742Z Autotune Choices Stats: 2025-09-07T09:44:43.6593791Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_173", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.007712000049650669, "best_triton_pos": 0} 2025-09-07T09:44:43.8815422Z AUTOTUNE mm(6272x120, 120x40) 2025-09-07T09:44:43.8815786Z strides: [120, 1], [1, 120] 2025-09-07T09:44:43.8816065Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:43.8816785Z triton_mm_173 0.0077 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:43.8817463Z mm 0.0078 ms 99.2% 
2025-09-07T09:44:43.8818067Z triton_mm_189 0.0083 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:43.8819067Z triton_mm_175 0.0084 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:43.8820049Z triton_mm_188 0.0084 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:43.8821024Z triton_mm_183 0.0084 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:43.8822113Z triton_mm_186 0.0085 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:43.8823022Z triton_mm_180 0.0085 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:43.8823929Z triton_mm_184 0.0085 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:43.8824837Z triton_mm_185 0.0085 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:43.8826168Z SingleProcess AUTOTUNE benchmarking takes 0.8931 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:44:44.1117737Z Autotune Choices Stats: 2025-09-07T09:44:44.1118730Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_272", "best_kernel_desc": 
"ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007648000027984381, "best_triton_pos": 0} 2025-09-07T09:44:44.1134389Z AUTOTUNE mm(6272x40, 40x240) 2025-09-07T09:44:44.1135248Z strides: [40, 1], [1, 40] 2025-09-07T09:44:44.1135514Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:44.1136118Z triton_mm_272 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:44.1137046Z triton_mm_271 0.0079 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:44.1137956Z triton_mm_281 0.0080 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.1138886Z triton_mm_282 0.0080 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:44.1139461Z mm 0.0082 ms 93.7% 2025-09-07T09:44:44.1139989Z triton_mm_276 0.0082 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:44.1140876Z triton_mm_268 0.0082 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:44.1141871Z triton_mm_275 0.0083 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.1142787Z triton_mm_280 0.0084 ms 90.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.1143693Z triton_mm_264 0.0085 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:44:44.1144480Z SingleProcess AUTOTUNE benchmarking takes 0.2288 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:44.3417066Z Autotune Choices Stats: 2025-09-07T09:44:44.3418648Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_287", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007360000163316727, "best_triton_pos": 0} 2025-09-07T09:44:44.3477412Z AUTOTUNE mm(1568x240, 240x80) 2025-09-07T09:44:44.3477656Z strides: [240, 1], [1, 240] 2025-09-07T09:44:44.3477914Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:44.3478553Z triton_mm_287 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:44.3479541Z triton_mm_291 0.0076 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:44.3480153Z mm 0.0076 ms 96.6% 2025-09-07T09:44:44.3480711Z triton_mm_286 0.0077 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:44.3481909Z triton_mm_285 0.0077 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:44.3482797Z triton_mm_290 0.0077 ms 95.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:44.3483820Z triton_mm_284 0.0079 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:44.3484726Z triton_mm_294 0.0080 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.3485918Z triton_mm_295 0.0083 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:44.3486836Z triton_mm_293 0.0084 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:44.3487615Z SingleProcess AUTOTUNE benchmarking takes 0.2337 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:44.5775677Z Autotune Choices Stats: 2025-09-07T09:44:44.5776580Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_309", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.007071999832987785, "best_triton_pos": 0} 2025-09-07T09:44:44.5823330Z AUTOTUNE mm(1568x80, 80x240) 2025-09-07T09:44:44.5823782Z strides: [80, 1], [1, 80] 2025-09-07T09:44:44.5824172Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:44.5825543Z triton_mm_309 0.0071 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:44.5827143Z triton_mm_316 0.0071 ms 99.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:44.5828682Z triton_mm_311 0.0072 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.5830218Z triton_mm_312 0.0072 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:44.5832235Z triton_mm_315 0.0073 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.5833267Z triton_mm_313 0.0073 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.5833892Z mm 0.0074 ms 96.1% 2025-09-07T09:44:44.5834450Z triton_mm_305 0.0074 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:44.5835566Z triton_mm_303 0.0074 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:44.5836533Z triton_mm_304 0.0074 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:44.5837379Z SingleProcess AUTOTUNE benchmarking takes 0.2340 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:44.8142556Z Autotune Choices Stats: 2025-09-07T09:44:44.8143866Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_430", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.0074880002066493034, "best_triton_pos": 0} 2025-09-07T09:44:44.8158642Z AUTOTUNE mm(1568x80, 80x480) 2025-09-07T09:44:44.8158961Z strides: [80, 1], [1, 80] 2025-09-07T09:44:44.8159278Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:44.8160368Z triton_mm_430 0.0075 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:44.8161671Z triton_mm_427 0.0077 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.8162946Z triton_mm_425 0.0077 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.8164231Z triton_mm_426 0.0078 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:44.8165642Z triton_mm_429 0.0078 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.8166913Z triton_mm_419 0.0079 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:44.8167715Z mm 0.0080 ms 93.2% 2025-09-07T09:44:44.8168459Z triton_mm_428 0.0081 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:44.8169762Z triton_mm_417 0.0083 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, 
BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:44.8171090Z triton_mm_432 0.0083 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:44.8172269Z SingleProcess AUTOTUNE benchmarking takes 0.2300 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:45.0456572Z Autotune Choices Stats: 2025-09-07T09:44:45.0457542Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_439", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0080960001796484, "best_triton_pos": 0} 2025-09-07T09:44:45.0472497Z AUTOTUNE mm(1568x480, 480x96) 2025-09-07T09:44:45.0472747Z strides: [480, 1], [1, 480] 2025-09-07T09:44:45.0473003Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:45.0473660Z triton_mm_439 0.0081 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:45.0474304Z mm 0.0084 ms 96.9% 2025-09-07T09:44:45.0474892Z triton_mm_443 0.0086 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:45.0476179Z triton_mm_438 0.0092 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:45.0477165Z triton_mm_447 0.0093 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:45.0478353Z triton_mm_437 0.0093 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:45.0479319Z triton_mm_442 0.0093 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:45.0480446Z triton_mm_436 0.0099 ms 81.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:45.0481432Z triton_mm_446 0.0100 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:45.0482371Z triton_mm_445 0.0102 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:45.0483147Z SingleProcess AUTOTUNE benchmarking takes 0.2308 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:45.2754289Z Autotune Choices Stats: 2025-09-07T09:44:45.2755484Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_457", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.007327999919652939, "best_triton_pos": 0} 2025-09-07T09:44:45.2795695Z AUTOTUNE mm(1568x96, 96x288) 2025-09-07T09:44:45.2795950Z strides: [96, 1], [1, 96] 2025-09-07T09:44:45.2796197Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:45.2796825Z triton_mm_457 0.0073 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:45.2797435Z mm 0.0075 ms 98.3% 2025-09-07T09:44:45.2797996Z triton_mm_456 0.0076 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:45.2798944Z triton_mm_461 0.0077 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:45.2799893Z triton_mm_468 0.0077 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:45.2800849Z triton_mm_464 0.0077 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:45.2801800Z triton_mm_465 0.0078 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:45.2802766Z triton_mm_455 0.0079 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:45.2803724Z triton_mm_463 0.0079 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:45.2804689Z triton_mm_466 0.0079 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:45.2805678Z SingleProcess AUTOTUNE benchmarking takes 0.2318 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:45.5047330Z Autotune Choices Stats: 2025-09-07T09:44:45.5048234Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_477", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007584000006318092, "best_triton_pos": 0} 2025-09-07T09:44:45.7181876Z AUTOTUNE mm(1568x288, 288x96) 2025-09-07T09:44:45.7182227Z strides: [288, 1], [1, 288] 2025-09-07T09:44:45.7182443Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:45.7183014Z triton_mm_477 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:45.7183574Z mm 0.0078 ms 97.1% 2025-09-07T09:44:45.7184266Z triton_mm_481 0.0078 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:45.7185409Z triton_mm_476 0.0079 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:45.7186252Z triton_mm_475 0.0081 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:45.7187097Z triton_mm_480 0.0083 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:45.7187927Z triton_mm_483 0.0086 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:45.7188766Z triton_mm_484 0.0086 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:45.7189601Z triton_mm_474 0.0087 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, 
num_warps=4 2025-09-07T09:44:45.7190444Z triton_mm_487 0.0087 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:45.7191175Z SingleProcess AUTOTUNE benchmarking takes 0.4379 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:45.9518029Z Autotune Choices Stats: 2025-09-07T09:44:45.9518987Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_582", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.007935999892652035, "best_triton_pos": 0} 2025-09-07T09:44:45.9554508Z AUTOTUNE mm(1568x96, 96x576) 2025-09-07T09:44:45.9554766Z strides: [96, 1], [1, 96] 2025-09-07T09:44:45.9555248Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:45.9555910Z triton_mm_582 0.0079 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:45.9556922Z triton_mm_575 0.0080 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:45.9557896Z triton_mm_578 0.0081 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:45.9558861Z triton_mm_581 0.0081 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:45.9559831Z triton_mm_579 0.0082 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:45.9560452Z 
mm 0.0083 ms 95.8% 2025-09-07T09:44:45.9561244Z triton_mm_585 0.0083 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:45.9562210Z triton_mm_577 0.0085 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:45.9563130Z triton_mm_586 0.0085 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:45.9564163Z triton_mm_580 0.0086 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:45.9565101Z SingleProcess AUTOTUNE benchmarking takes 0.2341 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:46.1914825Z Autotune Choices Stats: 2025-09-07T09:44:46.1916198Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.007840000092983246, "best_triton_pos": 1, "best_triton_time": 0.007935999892652035, "best_triton_kernel": "triton_mm_591", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:44:46.4212735Z AUTOTUNE mm(392x576, 576x192) 2025-09-07T09:44:46.4213139Z strides: [576, 1], [1, 576] 2025-09-07T09:44:46.4213461Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:46.4213823Z mm 0.0078 ms 100.0% 2025-09-07T09:44:46.4214573Z triton_mm_591 0.0079 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:46.4216254Z triton_mm_595 0.0083 ms 95.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:46.4217423Z triton_mm_590 0.0092 ms 84.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:46.4218544Z triton_mm_594 0.0093 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:46.4219681Z triton_mm_589 0.0094 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:46.4220815Z triton_mm_598 0.0098 ms 80.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:46.4222196Z triton_mm_588 0.0100 ms 78.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:46.4223094Z triton_mm_605 0.0102 ms 76.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:46.4223975Z triton_mm_597 0.0104 ms 75.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:46.4224750Z SingleProcess AUTOTUNE benchmarking takes 0.4647 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:46.6531279Z Autotune Choices Stats: 2025-09-07T09:44:46.6532894Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_614", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007519999984651804, "best_triton_pos": 0} 2025-09-07T09:44:46.8876313Z AUTOTUNE mm(392x192, 192x1152) 2025-09-07T09:44:46.8876572Z strides: [192, 1], [1, 192] 2025-09-07T09:44:46.8876822Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:46.8877477Z triton_mm_614 0.0075 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:46.8878462Z triton_mm_613 0.0080 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:46.8879692Z triton_mm_617 0.0082 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:46.8880668Z triton_mm_608 0.0083 ms 90.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:46.8881636Z triton_mm_609 0.0083 ms 90.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:46.8882595Z triton_mm_620 0.0083 ms 90.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:46.8883179Z mm 0.0084 ms 89.4% 2025-09-07T09:44:46.8883709Z triton_mm_616 0.0085 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:46.8884610Z triton_mm_618 0.0086 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=4, num_warps=4 2025-09-07T09:44:46.8885692Z triton_mm_624 0.0086 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:46.8886490Z SingleProcess AUTOTUNE benchmarking takes 0.4643 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T09:44:47.1205507Z Autotune Choices Stats: 2025-09-07T09:44:47.1207603Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.00863999966531992, "best_triton_pos": 1, "best_triton_time": 0.009119999594986439, "best_triton_kernel": "triton_mm_629", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:44:47.1224783Z AUTOTUNE mm(392x1152, 1152x192) 2025-09-07T09:44:47.1225124Z strides: [1152, 1], [1, 1152] 2025-09-07T09:44:47.1225356Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:47.1225577Z mm 0.0086 ms 100.0% 2025-09-07T09:44:47.1226083Z triton_mm_629 0.0091 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:47.1226926Z triton_mm_633 0.0096 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:47.1227768Z triton_mm_637 0.0107 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:47.1228601Z triton_mm_628 0.0121 ms 71.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:47.1229422Z triton_mm_627 0.0125 ms 69.2% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:47.1230239Z triton_mm_632 0.0126 ms 68.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:47.1231306Z triton_mm_643 0.0130 ms 66.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:47.1232308Z triton_mm_626 0.0132 ms 65.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:47.1233445Z triton_mm_636 0.0133 ms 65.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:47.1234293Z SingleProcess AUTOTUNE benchmarking takes 0.2343 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:44:47.3672112Z Autotune Choices Stats: 2025-09-07T09:44:47.3673659Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_743", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008991999551653862, "best_triton_pos": 0} 2025-09-07T09:44:47.5905685Z AUTOTUNE mm(392x1152, 1152x320) 2025-09-07T09:44:47.5906146Z strides: [1152, 1], [1, 1152] 2025-09-07T09:44:47.5906587Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:47.5907692Z triton_mm_743 0.0090 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:47.5908750Z mm 0.0093 ms 96.6% 2025-09-07T09:44:47.5909669Z triton_mm_747 0.0095 ms 94.3% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:47.5911222Z triton_mm_751 0.0105 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:47.5912923Z triton_mm_742 0.0122 ms 73.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:47.5914139Z triton_mm_741 0.0125 ms 71.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:47.5915252Z triton_mm_746 0.0125 ms 71.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:47.5916227Z triton_mm_757 0.0128 ms 70.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:47.5917206Z triton_mm_750 0.0132 ms 68.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:47.5918169Z triton_mm_740 0.0132 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:47.5919017Z SingleProcess AUTOTUNE benchmarking takes 0.4614 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:44:47.8190429Z Autotune Choices Stats: 2025-09-07T09:44:47.8192412Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.008511999621987343, "best_triton_pos": 1, "best_triton_time": 0.008832000195980072, 
"best_triton_kernel": "triton_mm_765", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T09:44:48.0500534Z AUTOTUNE mm(392x320, 320x1280) 2025-09-07T09:44:48.0501566Z strides: [320, 1], [1, 320] 2025-09-07T09:44:48.0501994Z dtypes: torch.float16, torch.float16 2025-09-07T09:44:48.0502444Z mm 0.0085 ms 100.0% 2025-09-07T09:44:48.0503385Z triton_mm_765 0.0088 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:44:48.0505433Z triton_mm_769 0.0090 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:48.0507478Z triton_mm_770 0.0093 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:48.0509033Z triton_mm_768 0.0094 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:48.0510586Z triton_mm_772 0.0095 ms 89.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:44:48.0512116Z triton_mm_776 0.0095 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:48.0514109Z triton_mm_759 0.0096 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:44:48.0515291Z triton_mm_775 0.0096 ms 88.4% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:48.0516250Z triton_mm_760 0.0097 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:44:48.0517099Z SingleProcess AUTOTUNE benchmarking takes 0.4588 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:44:48.8160445Z Autotune Choices Stats: 2025-09-07T09:44:48.8161460Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_781", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.009344000369310379, "best_triton_pos": 0} 2025-09-07T09:44:48.8321469Z AUTOTUNE addmm(8x1000, 8x1280, 1280x1000) 2025-09-07T09:44:48.8321796Z strides: [0, 1], [1280, 1], [1, 1280] 2025-09-07T09:44:48.8322141Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:44:48.8323471Z triton_mm_781 0.0093 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:44:48.8325512Z triton_mm_785 0.0101 ms 92.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:48.8326529Z bias_addmm 0.0102 ms 91.8% 2025-09-07T09:44:48.8327501Z triton_mm_789 0.0112 ms 83.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:44:48.8329044Z triton_mm_793 0.0122 ms 76.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T09:44:48.8330569Z triton_mm_780 0.0133 ms 70.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:44:48.8331537Z addmm 0.0138 ms 67.6% 2025-09-07T09:44:48.8332510Z triton_mm_779 0.0140 ms 67.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:44:48.8333806Z triton_mm_784 0.0142 ms 65.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:44:48.8334630Z triton_mm_778 0.0146 ms 63.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:44:48.8335716Z SingleProcess AUTOTUNE benchmarking takes 0.7804 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T09:45:05.3716363Z Autotune Choices Stats: 2025-09-07T09:45:05.3717691Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_816", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.006688000168651342, "best_triton_pos": 0} 2025-09-07T09:45:05.4678919Z AUTOTUNE mm(1000x8, 8x1280) 2025-09-07T09:45:05.4679322Z strides: [1, 1000], [1280, 1] 2025-09-07T09:45:05.4679767Z dtypes: torch.float16, torch.float16 2025-09-07T09:45:05.4680835Z triton_mm_816 0.0067 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:45:05.4682451Z triton_mm_817 0.0067 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=8 2025-09-07T09:45:05.4684017Z triton_mm_820 0.0069 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:45:05.4685866Z triton_mm_821 0.0069 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:45:05.4687502Z triton_mm_818 0.0069 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:45:05.4688470Z triton_mm_822 0.0069 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:45:05.4689450Z triton_mm_825 0.0069 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:45:05.4690419Z triton_mm_815 0.0071 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:45:05.4691381Z triton_mm_819 0.0071 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:45:05.4692347Z triton_mm_823 0.0071 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:45:05.4693203Z SingleProcess AUTOTUNE benchmarking takes 0.5552 seconds and 0.0003 seconds precompiling for 17 choices 2025-09-07T09:45:06.1808870Z Autotune Choices Stats: 2025-09-07T09:45:06.1810410Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 
0.009375999681651592, "best_triton_pos": 1, "best_triton_time": 0.009824000298976898, "best_triton_kernel": "triton_mm_802", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:45:06.3547631Z AUTOTUNE mm(8x1000, 1000x1280) 2025-09-07T09:45:06.3548054Z strides: [1000, 1], [1280, 1] 2025-09-07T09:45:06.3548736Z dtypes: torch.float16, torch.float16 2025-09-07T09:45:06.3549006Z mm 0.0094 ms 100.0% 2025-09-07T09:45:06.3549622Z triton_mm_802 0.0098 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:45:06.3550610Z triton_mm_798 0.0100 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:45:06.3551840Z triton_mm_806 0.0103 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:45:06.3552827Z triton_mm_810 0.0116 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:45:06.3553778Z triton_mm_796 0.0117 ms 80.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:45:06.3554735Z triton_mm_797 0.0118 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:45:06.3556065Z triton_mm_801 0.0125 ms 75.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4 2025-09-07T09:45:06.3557061Z triton_mm_808 0.0130 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:45:06.3558023Z triton_mm_805 0.0131 ms 71.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:45:06.3558853Z SingleProcess AUTOTUNE benchmarking takes 0.3553 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:45:13.8531628Z W0907 09:45:13.852000 56693 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:45:42.0746988Z pass 2025-09-07T09:45:48.0742465Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:45:48.0744457Z import pynvml # type: ignore[import] 2025-09-07T09:45:51.1312255Z 2025-09-07T09:45:55.2621817Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:45:55.2622178Z loading model: 0it [00:04, ?it/s] 2025-09-07T09:45:55.2622490Z cuda train swin_base_patch4_window7_224 2025-09-07T09:46:39.7096320Z Autotune Choices Stats: 2025-09-07T09:46:39.7097405Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_82", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.024191999807953835, "best_triton_pos": 0} 2025-09-07T09:46:39.9214161Z AUTOTUNE addmm(25088x512, 25088x128, 128x512) 2025-09-07T09:46:39.9214513Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T09:46:39.9214835Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:46:39.9216120Z triton_mm_82 0.0242 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:39.9217140Z triton_mm_89 0.0250 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:39.9218159Z triton_mm_86 0.0250 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:39.9219616Z triton_mm_90 0.0251 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:39.9220568Z triton_mm_83 0.0254 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:39.9221900Z triton_mm_84 0.0256 ms 
94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:39.9222853Z triton_mm_87 0.0258 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:39.9223883Z triton_mm_79 0.0261 ms 92.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:39.9224827Z triton_mm_85 0.0265 ms 91.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:39.9225878Z triton_mm_88 0.0270 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:46:39.9226665Z SingleProcess AUTOTUNE benchmarking takes 0.4910 seconds and 0.0004 seconds precompiling for 21 choices 2025-09-07T09:46:40.5183589Z Autotune Choices Stats: 2025-09-07T09:46:40.5185246Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.01651199907064438, "best_triton_pos": 1, "best_triton_time": 0.017664000391960144, "best_triton_kernel": "triton_mm_311", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:46:40.6229445Z AUTOTUNE addmm(6272x1024, 6272x256, 256x1024) 2025-09-07T09:46:40.6229812Z strides: [0, 1], [256, 1], [1, 256] 2025-09-07T09:46:40.6230140Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:46:40.6230477Z bias_addmm 0.0165 ms 100.0% 2025-09-07T09:46:40.6231138Z triton_mm_311 0.0177 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:40.6232116Z triton_mm_310 0.0201 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:40.6233069Z triton_mm_309 0.0202 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:40.6234165Z triton_mm_317 0.0205 ms 80.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:40.6235397Z triton_mm_318 0.0212 ms 77.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:40.6236368Z triton_mm_307 0.0215 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:40.6237320Z triton_mm_306 0.0222 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:40.6238294Z triton_mm_312 0.0252 ms 65.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:40.6239610Z triton_mm_316 0.0262 ms 62.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:40.6240462Z SingleProcess AUTOTUNE benchmarking takes 0.3910 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T09:46:40.9863160Z Autotune Choices Stats: 2025-09-07T09:46:40.9864851Z {"num_choices": 8, 
"num_triton_choices": 7, "best_kernel": "triton_convolution2d_1", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.017152000218629837, "best_triton_pos": 0} 2025-09-07T09:46:41.0908954Z AUTOTUNE convolution(8x3x224x224, 128x3x4x4) 2025-09-07T09:46:41.0909329Z strides: [150528, 50176, 224, 1], [48, 16, 4, 1] 2025-09-07T09:46:41.0909685Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:41.0910490Z triton_convolution2d_1 0.0172 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:46:41.0911789Z triton_convolution2d_6 0.0172 ms 99.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:46:41.0913062Z triton_convolution2d_0 0.0176 ms 97.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:46:41.0914281Z triton_convolution2d_3 0.0176 ms 97.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:46:41.0916224Z triton_convolution2d_5 0.0185 ms 92.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:46:41.0917457Z triton_convolution2d_4 0.0193 ms 88.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, 
num_stages=2, num_warps=4 2025-09-07T09:46:41.0918221Z convolution 0.0270 ms 63.5% 2025-09-07T09:46:41.0918958Z triton_convolution2d_2 0.0590 ms 29.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:46:41.0919932Z SingleProcess AUTOTUNE benchmarking takes 0.2039 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:46:41.6391405Z Autotune Choices Stats: 2025-09-07T09:46:41.6392739Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.013887999579310417, "best_triton_pos": 1, "best_triton_time": 0.014208000153303146, "best_triton_kernel": "triton_mm_544", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:46:41.7971925Z AUTOTUNE addmm(1568x2048, 1568x512, 512x2048) 2025-09-07T09:46:41.7972276Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T09:46:41.7972596Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:46:41.7972942Z bias_addmm 0.0139 ms 100.0% 2025-09-07T09:46:41.7973568Z triton_mm_544 0.0142 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:41.7974700Z triton_mm_538 0.0159 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:41.7976363Z triton_mm_543 0.0160 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:41.7977348Z triton_mm_536 0.0164 ms 84.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:41.7978585Z triton_mm_540 0.0172 ms 80.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:41.7979559Z triton_mm_545 0.0178 ms 78.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:41.7980521Z triton_mm_534 0.0183 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:41.7981563Z triton_mm_533 0.0194 ms 71.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:41.7982515Z triton_mm_537 0.0195 ms 71.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:41.7983354Z SingleProcess AUTOTUNE benchmarking takes 0.4279 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T09:46:42.9479919Z Autotune Choices Stats: 2025-09-07T09:46:42.9480949Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_207", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.0197759997099638, "best_triton_pos": 0} 2025-09-07T09:46:43.0228662Z AUTOTUNE mm(25088x512, 512x128) 2025-09-07T09:46:43.0228962Z strides: [512, 1], [1, 512] 2025-09-07T09:46:43.0229220Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:43.0229896Z triton_mm_207 0.0198 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T09:46:43.0230912Z triton_mm_213 0.0205 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:43.0231879Z triton_mm_214 0.0226 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:43.0232868Z triton_mm_208 0.0234 ms 84.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:43.0233829Z triton_mm_203 0.0236 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:43.0235262Z triton_mm_206 0.0241 ms 82.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:43.0236235Z triton_mm_212 0.0242 ms 81.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:43.0237193Z triton_mm_209 0.0245 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:43.0238147Z triton_mm_205 0.0248 ms 79.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:43.0239505Z triton_mm_210 0.0249 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:43.0240349Z SingleProcess AUTOTUNE benchmarking takes 0.6674 seconds and 
0.0002 seconds precompiling for 20 choices 2025-09-07T09:46:43.7274234Z Autotune Choices Stats: 2025-09-07T09:46:43.7276803Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.013728000223636627, "best_triton_pos": 1, "best_triton_time": 0.015039999969303608, "best_triton_kernel": "triton_mm_2436", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T09:46:43.7768976Z AUTOTUNE addmm(392x4096, 392x1024, 1024x4096) 2025-09-07T09:46:43.7769398Z strides: [0, 1], [1024, 1], [1, 1024] 2025-09-07T09:46:43.7769782Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:46:43.7770123Z bias_addmm 0.0137 ms 100.0% 2025-09-07T09:46:43.7770833Z triton_mm_2436 0.0150 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:43.7783174Z triton_mm_2429 0.0161 ms 85.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:43.7784271Z triton_mm_2435 0.0167 ms 82.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:43.7785474Z triton_mm_2425 0.0167 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:43.7786400Z triton_mm_2430 0.0181 ms 75.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:43.7787249Z triton_mm_2428 0.0184 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:43.7788086Z triton_mm_2432 0.0184 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:43.7788636Z addmm 0.0190 ms 72.3% 2025-09-07T09:46:43.7789156Z triton_mm_2427 0.0212 ms 64.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:43.7789902Z SingleProcess AUTOTUNE benchmarking takes 0.3156 seconds and 0.0003 seconds precompiling for 21 choices 2025-09-07T09:46:44.2956077Z Autotune Choices Stats: 2025-09-07T09:46:44.2957228Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_441", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.015424000099301338, "best_triton_pos": 0} 2025-09-07T09:46:44.3669573Z AUTOTUNE mm(6272x1024, 1024x256) 2025-09-07T09:46:44.3669870Z strides: [1024, 1], [1, 1024] 2025-09-07T09:46:44.3670150Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:44.3670807Z triton_mm_441 0.0154 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:44.3671791Z triton_mm_434 0.0166 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:44.3673097Z triton_mm_430 0.0169 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:44.3674006Z triton_mm_440 0.0177 ms 87.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:44.3674909Z triton_mm_435 0.0184 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:44.3676288Z triton_mm_433 0.0191 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:44.3677270Z triton_mm_437 0.0196 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:44.3678237Z triton_mm_431 0.0200 ms 77.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:44.3679202Z triton_mm_432 0.0220 ms 70.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:44.3680166Z triton_mm_436 0.0224 ms 68.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:44.3681013Z SingleProcess AUTOTUNE benchmarking takes 0.3094 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:46:44.8996773Z Autotune Choices Stats: 2025-09-07T09:46:44.8997845Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_2326", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.015072000212967396, "best_triton_pos": 0} 2025-09-07T09:46:44.9788008Z AUTOTUNE mm(1568x2048, 2048x512) 2025-09-07T09:46:44.9788308Z strides: [2048, 1], [1, 2048] 
2025-09-07T09:46:44.9788572Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:44.9789291Z triton_mm_2326 0.0151 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:44.9790325Z triton_mm_2332 0.0185 ms 81.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:44.9791309Z triton_mm_2322 0.0191 ms 79.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:44.9792279Z triton_mm_2321 0.0207 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:44.9793242Z triton_mm_2325 0.0210 ms 71.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:44.9794210Z triton_mm_2331 0.0224 ms 67.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:44.9795606Z triton_mm_2324 0.0241 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:44.9796569Z triton_mm_2328 0.0244 ms 61.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:44.9797936Z triton_mm_2318 0.0269 ms 55.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:44.9798895Z 
triton_mm_2315 0.0274 ms 55.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:44.9799737Z SingleProcess AUTOTUNE benchmarking takes 0.3280 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:46:47.4460674Z Autotune Choices Stats: 2025-09-07T09:46:47.4462648Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.015615999698638916, "best_triton_pos": 1, "best_triton_time": 0.018144000321626663, "best_triton_kernel": "triton_mm_23", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:46:47.4785439Z AUTOTUNE mm(25088x128, 128x384) 2025-09-07T09:46:47.4785899Z strides: [128, 1], [1, 128] 2025-09-07T09:46:47.4786358Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:47.4786870Z mm 0.0156 ms 100.0% 2025-09-07T09:46:47.4787730Z triton_mm_23 0.0181 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:47.4788708Z triton_mm_20 0.0187 ms 83.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:47.4789668Z triton_mm_16 0.0187 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:47.4790611Z triton_mm_18 0.0197 ms 79.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:47.4791564Z triton_mm_17 0.0199 ms 78.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:47.4792519Z triton_mm_24 0.0203 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:47.4793474Z triton_mm_21 0.0208 ms 75.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:47.4794426Z triton_mm_19 0.0217 ms 72.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:47.4795519Z triton_mm_13 0.0220 ms 71.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:47.4796352Z SingleProcess AUTOTUNE benchmarking takes 0.2616 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:46:48.0018484Z Autotune Choices Stats: 2025-09-07T09:46:48.0019541Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_38", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8", "best_time": 0.019071999937295914, "best_triton_pos": 0} 2025-09-07T09:46:48.1248583Z AUTOTUNE bmm(2048x49x32, 2048x32x49) 2025-09-07T09:46:48.1248905Z strides: [1600, 32, 1], [1600, 49, 1] 2025-09-07T09:46:48.1249195Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:48.1249870Z triton_bmm_38 0.0191 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:46:48.1250868Z triton_bmm_39 0.0193 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:48.1252260Z triton_bmm_33 0.0193 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:48.1253210Z triton_bmm_36 0.0193 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:48.1254395Z triton_bmm_32 0.0195 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:48.1255802Z triton_bmm_37 0.0196 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:48.1256737Z triton_bmm_34 0.0197 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:48.1257569Z triton_bmm_35 0.0197 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:48.1258390Z triton_bmm_30 0.0210 ms 91.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:48.1259212Z triton_bmm_28 0.0212 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:48.1259938Z SingleProcess AUTOTUNE benchmarking takes 0.6446 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:46:48.2922662Z Autotune Choices Stats: 2025-09-07T09:46:48.2923718Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": 
"triton_bmm_51", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.0161920003592968, "best_triton_pos": 0} 2025-09-07T09:46:48.3444436Z AUTOTUNE bmm(2048x49x49, 2048x49x32) 2025-09-07T09:46:48.3444770Z strides: [2432, 49, 1], [1600, 32, 1] 2025-09-07T09:46:48.3445256Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:48.3445956Z triton_bmm_51 0.0162 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:48.3446965Z triton_bmm_50 0.0163 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:48.3447938Z triton_bmm_47 0.0163 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:48.3448932Z triton_bmm_44 0.0164 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:48.3449907Z triton_bmm_53 0.0166 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:48.3450864Z triton_bmm_41 0.0176 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:48.3451816Z triton_bmm_46 0.0184 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:48.3452757Z triton_bmm_49 0.0187 ms 86.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:48.3454079Z triton_bmm_40 0.0188 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:46:48.3455194Z triton_bmm_42 0.0188 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:48.3456266Z SingleProcess AUTOTUNE benchmarking takes 0.2188 seconds and 0.0004 seconds precompiling for 15 choices 2025-09-07T09:46:48.5666946Z Autotune Choices Stats: 2025-09-07T09:46:48.5668030Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_65", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.01065600011497736, "best_triton_pos": 0} 2025-09-07T09:46:48.7583519Z AUTOTUNE mm(25088x128, 128x128) 2025-09-07T09:46:48.7584001Z strides: [128, 1], [1, 128] 2025-09-07T09:46:48.7584452Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:48.7585804Z triton_mm_65 0.0107 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:48.7587533Z triton_mm_63 0.0112 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:48.7588555Z triton_mm_64 0.0113 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:48.7589454Z triton_mm_67 0.0113 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, 
BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:48.7590346Z triton_mm_60 0.0114 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:48.7591237Z triton_mm_70 0.0114 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:48.7592136Z triton_mm_71 0.0114 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:48.7593022Z triton_mm_68 0.0116 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:48.7593578Z mm 0.0118 ms 90.0% 2025-09-07T09:46:48.7594095Z triton_mm_66 0.0118 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:48.7594882Z SingleProcess AUTOTUNE benchmarking takes 0.4129 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T09:46:48.9845292Z Autotune Choices Stats: 2025-09-07T09:46:48.9846660Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01142400037497282, "best_triton_pos": 1, "best_triton_time": 0.011680000461637974, "best_triton_kernel": "triton_mm_233", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T09:46:49.1987278Z AUTOTUNE mm(6272x512, 512x256) 2025-09-07T09:46:49.1987620Z strides: [512, 1], [1, 512] 2025-09-07T09:46:49.1987880Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:49.1988143Z mm 0.0114 ms 
100.0% 2025-09-07T09:46:49.1989052Z triton_mm_233 0.0117 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:49.1990056Z triton_mm_226 0.0121 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.1991030Z triton_mm_222 0.0124 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:49.1992205Z triton_mm_232 0.0126 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.1993173Z triton_mm_225 0.0132 ms 86.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:49.1994143Z triton_mm_229 0.0134 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:49.1995448Z triton_mm_227 0.0143 ms 79.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:49.1996439Z triton_mm_224 0.0144 ms 79.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.1997364Z triton_mm_228 0.0146 ms 78.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.1998064Z SingleProcess AUTOTUNE benchmarking takes 0.4342 seconds and 0.0002 seconds 
precompiling for 20 choices 2025-09-07T09:46:49.4267351Z Autotune Choices Stats: 2025-09-07T09:46:49.4268747Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013088000006973743, "best_triton_pos": 1, "best_triton_time": 0.013824000023305416, "best_triton_kernel": "triton_mm_250", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:46:49.6558103Z AUTOTUNE mm(6272x256, 256x768) 2025-09-07T09:46:49.6558426Z strides: [256, 1], [1, 256] 2025-09-07T09:46:49.6558693Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:49.6558992Z mm 0.0131 ms 100.0% 2025-09-07T09:46:49.6559621Z triton_mm_250 0.0138 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.6560657Z triton_mm_245 0.0150 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.6561656Z triton_mm_251 0.0153 ms 85.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.6562634Z triton_mm_243 0.0155 ms 84.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.6563598Z triton_mm_247 0.0156 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.6564570Z triton_mm_244 0.0160 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T09:46:49.6565756Z triton_mm_248 0.0160 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:49.6567135Z triton_mm_252 0.0173 ms 75.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:49.6567976Z triton_mm_241 0.0175 ms 74.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:49.6568896Z SingleProcess AUTOTUNE benchmarking takes 0.4553 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T09:46:49.7885282Z Autotune Choices Stats: 2025-09-07T09:46:49.7886331Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_260", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.012992000207304955, "best_triton_pos": 0} 2025-09-07T09:46:49.8891538Z AUTOTUNE bmm(1024x49x32, 1024x32x49) 2025-09-07T09:46:49.8891849Z strides: [1600, 32, 1], [1600, 49, 1] 2025-09-07T09:46:49.8892126Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:49.8892841Z triton_bmm_260 0.0130 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:49.8893832Z triton_bmm_265 0.0131 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:46:49.8894805Z triton_bmm_259 0.0131 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 
2025-09-07T09:46:49.8895962Z triton_bmm_263 0.0131 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:49.8896944Z triton_bmm_264 0.0131 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:49.8897775Z triton_bmm_261 0.0131 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:49.8898611Z triton_bmm_266 0.0131 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:49.8899438Z triton_bmm_262 0.0132 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:49.8900262Z triton_bmm_255 0.0137 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:49.8901103Z triton_bmm_257 0.0140 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:49.8901913Z SingleProcess AUTOTUNE benchmarking takes 0.2319 seconds and 0.0003 seconds precompiling for 15 choices 2025-09-07T09:46:50.0206204Z Autotune Choices Stats: 2025-09-07T09:46:50.0207222Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_278", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.01104000024497509, "best_triton_pos": 0} 
2025-09-07T09:46:50.1233134Z AUTOTUNE bmm(1024x49x49, 1024x49x32) 2025-09-07T09:46:50.1233471Z strides: [2432, 49, 1], [1600, 32, 1] 2025-09-07T09:46:50.1233754Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:50.1234878Z triton_bmm_278 0.0110 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:50.1236212Z triton_bmm_271 0.0111 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:50.1237200Z triton_bmm_277 0.0115 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:50.1238387Z triton_bmm_274 0.0117 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:50.1239282Z triton_bmm_280 0.0117 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:50.1240185Z triton_bmm_268 0.0124 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:50.1241088Z triton_bmm_269 0.0126 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:50.1242018Z triton_bmm_273 0.0127 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:50.1242922Z triton_bmm_279 0.0127 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:46:50.1243835Z triton_bmm_267 0.0128 ms 86.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:46:50.1244630Z SingleProcess AUTOTUNE benchmarking takes 0.2335 seconds and 0.0003 seconds precompiling for 15 choices 2025-09-07T09:46:50.3466379Z Autotune Choices Stats: 2025-09-07T09:46:50.3467557Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_292", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.009824000298976898, "best_triton_pos": 0} 2025-09-07T09:46:50.3546642Z AUTOTUNE mm(6272x256, 256x256) 2025-09-07T09:46:50.3546938Z strides: [256, 1], [1, 256] 2025-09-07T09:46:50.3547255Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:50.3548223Z triton_mm_292 0.0098 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:50.3549479Z triton_mm_288 0.0100 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:50.3550106Z mm 0.0100 ms 97.8% 2025-09-07T09:46:50.3550677Z triton_mm_299 0.0102 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:50.3551637Z triton_mm_295 0.0104 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:50.3552611Z triton_mm_298 0.0107 ms 91.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:50.3553575Z triton_mm_291 0.0108 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:50.3554885Z triton_mm_294 0.0108 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:50.3556182Z triton_mm_290 0.0109 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:50.3557357Z triton_mm_297 0.0113 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:50.3558116Z SingleProcess AUTOTUNE benchmarking takes 0.2305 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T09:46:50.5855420Z Autotune Choices Stats: 2025-09-07T09:46:50.5856698Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01119999960064888, "best_triton_pos": 1, "best_triton_time": 0.011455999687314034, "best_triton_kernel": "triton_mm_454", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:46:50.5886135Z AUTOTUNE mm(1568x1024, 1024x512) 2025-09-07T09:46:50.5886406Z strides: [1024, 1], [1, 1024] 2025-09-07T09:46:50.5886674Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:50.5886937Z mm 0.0112 ms 100.0% 2025-09-07T09:46:50.5887545Z triton_mm_454 0.0115 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 
2025-09-07T09:46:50.5888521Z triton_mm_460 0.0131 ms 85.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:50.5889483Z triton_mm_449 0.0132 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:50.5890448Z triton_mm_450 0.0133 ms 84.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:50.5891416Z triton_mm_453 0.0136 ms 82.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:50.5892381Z triton_mm_459 0.0148 ms 75.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:50.5893334Z triton_mm_452 0.0149 ms 75.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:50.5894291Z triton_mm_456 0.0150 ms 74.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:50.5895443Z triton_mm_443 0.0170 ms 65.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:50.5896305Z SingleProcess AUTOTUNE benchmarking takes 0.2270 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:46:50.8290111Z Autotune Choices Stats: 2025-09-07T09:46:50.8291417Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.011392000131309032, 
"best_triton_pos": 1, "best_triton_time": 0.012768000364303589, "best_triton_kernel": "triton_mm_472", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:46:51.0581963Z AUTOTUNE mm(1568x512, 512x1536) 2025-09-07T09:46:51.0582247Z strides: [512, 1], [1, 512] 2025-09-07T09:46:51.0582984Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:51.0583258Z mm 0.0114 ms 100.0% 2025-09-07T09:46:51.0583859Z triton_mm_472 0.0128 ms 89.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.0584839Z triton_mm_478 0.0134 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.0586371Z triton_mm_470 0.0138 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.0587415Z triton_mm_474 0.0142 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.0588492Z triton_mm_477 0.0145 ms 78.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.0589489Z triton_mm_471 0.0146 ms 78.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:51.0590453Z triton_mm_475 0.0147 ms 77.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T09:46:51.0591414Z triton_mm_468 0.0152 ms 74.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:51.0592379Z triton_mm_479 0.0154 ms 74.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:51.0593226Z SingleProcess AUTOTUNE benchmarking takes 0.4682 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:46:51.1994584Z Autotune Choices Stats: 2025-09-07T09:46:51.1995929Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_489", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.00979200005531311, "best_triton_pos": 0} 2025-09-07T09:46:51.2906599Z AUTOTUNE bmm(512x49x32, 512x32x49) 2025-09-07T09:46:51.2907032Z strides: [1600, 32, 1], [1600, 49, 1] 2025-09-07T09:46:51.2907419Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:51.2908267Z triton_bmm_489 0.0098 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.2909302Z triton_bmm_488 0.0100 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:51.2910283Z triton_bmm_491 0.0100 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:51.2911243Z triton_bmm_484 0.0100 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 
2025-09-07T09:46:51.2912203Z triton_bmm_486 0.0100 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:51.2913172Z triton_bmm_487 0.0102 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:51.2914131Z triton_bmm_490 0.0102 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:51.2915564Z triton_bmm_492 0.0103 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:46:51.2916529Z triton_bmm_493 0.0103 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:51.2917661Z triton_bmm_480 0.0104 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:46:51.2918418Z SingleProcess AUTOTUNE benchmarking takes 0.2310 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:46:51.4306052Z Autotune Choices Stats: 2025-09-07T09:46:51.4307185Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_507", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.009056000038981438, "best_triton_pos": 0} 2025-09-07T09:46:51.5187700Z AUTOTUNE bmm(512x49x49, 512x49x32) 2025-09-07T09:46:51.5188023Z strides: [2432, 49, 1], [1600, 32, 1] 2025-09-07T09:46:51.5188365Z dtypes: torch.float16, torch.float16 
2025-09-07T09:46:51.5189069Z triton_bmm_507 0.0091 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:51.5190054Z triton_bmm_498 0.0091 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:51.5191023Z triton_bmm_504 0.0091 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.5191981Z triton_bmm_501 0.0092 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:51.5192938Z triton_bmm_505 0.0093 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:51.5193904Z triton_bmm_495 0.0094 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:51.5194873Z triton_bmm_496 0.0094 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:51.5196112Z triton_bmm_503 0.0095 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:51.5197093Z triton_bmm_506 0.0096 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:46:51.5198026Z triton_bmm_497 0.0098 ms 92.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:51.5198797Z SingleProcess AUTOTUNE benchmarking takes 0.2272 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:46:51.7519858Z Autotune Choices Stats: 2025-09-07T09:46:51.7521107Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009503999724984169, "best_triton_pos": 1, "best_triton_time": 0.009631999768316746, "best_triton_kernel": "triton_mm_520", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:46:51.7547463Z AUTOTUNE mm(1568x512, 512x512) 2025-09-07T09:46:51.7547778Z strides: [512, 1], [1, 512] 2025-09-07T09:46:51.7548093Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:51.7548398Z mm 0.0095 ms 100.0% 2025-09-07T09:46:51.7548992Z triton_mm_520 0.0096 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:51.7550224Z triton_mm_515 0.0102 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:51.7551191Z triton_mm_519 0.0105 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.7552142Z triton_mm_518 0.0109 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:51.7553113Z triton_mm_526 0.0109 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T09:46:51.7554101Z triton_mm_522 0.0111 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:51.7555237Z triton_mm_525 0.0112 ms 84.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.7556206Z triton_mm_516 0.0116 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:51.7557164Z triton_mm_517 0.0124 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:51.7557998Z SingleProcess AUTOTUNE benchmarking takes 0.2354 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:46:52.1123833Z Autotune Choices Stats: 2025-09-07T09:46:52.1125287Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_2341", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.012896000407636166, "best_triton_pos": 0} 2025-09-07T09:46:52.2193028Z AUTOTUNE mm(392x2048, 2048x1024) 2025-09-07T09:46:52.2193315Z strides: [2048, 1], [1, 2048] 2025-09-07T09:46:52.2193566Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:52.2194215Z triton_mm_2341 0.0129 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:52.2194851Z mm 0.0129 ms 99.8% 2025-09-07T09:46:52.2195737Z triton_mm_2345 0.0145 ms 89.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=4, num_warps=4 2025-09-07T09:46:52.2196726Z triton_mm_2337 0.0169 ms 76.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:52.2197708Z triton_mm_2351 0.0180 ms 71.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:52.2198639Z triton_mm_2344 0.0203 ms 63.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:52.2199825Z triton_mm_2340 0.0204 ms 63.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:52.2200715Z triton_mm_2334 0.0212 ms 60.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:52.2201588Z triton_mm_2336 0.0218 ms 59.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:52.2202681Z triton_mm_2350 0.0224 ms 57.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:52.2203465Z SingleProcess AUTOTUNE benchmarking takes 0.3511 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:46:52.4546432Z Autotune Choices Stats: 2025-09-07T09:46:52.4547709Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.012703999876976013, "best_triton_pos": 1, "best_triton_time": 0.013567999936640263, "best_triton_kernel": "triton_mm_2370", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T09:46:52.6864664Z AUTOTUNE mm(392x1024, 1024x3072) 2025-09-07T09:46:52.6865120Z strides: [1024, 1], [1, 1024] 2025-09-07T09:46:52.6865399Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:52.6865690Z mm 0.0127 ms 100.0% 2025-09-07T09:46:52.6866327Z triton_mm_2370 0.0136 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:52.6867445Z triton_mm_2359 0.0149 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:52.6868431Z triton_mm_2363 0.0150 ms 84.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:52.6869385Z triton_mm_2369 0.0156 ms 81.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:52.6870359Z triton_mm_2364 0.0167 ms 75.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:52.6871322Z triton_mm_2366 0.0170 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:52.6872275Z triton_mm_2362 0.0172 ms 74.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:52.6873232Z triton_mm_2360 0.0181 ms 70.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=4 2025-09-07T09:46:52.6874184Z triton_mm_2365 0.0198 ms 64.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:52.6875160Z SingleProcess AUTOTUNE benchmarking takes 0.4664 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:46:52.8288248Z Autotune Choices Stats: 2025-09-07T09:46:52.8289238Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_2381", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.007935999892652035, "best_triton_pos": 0} 2025-09-07T09:46:53.0280356Z AUTOTUNE bmm(256x49x32, 256x32x49) 2025-09-07T09:46:53.0280649Z strides: [1600, 32, 1], [1600, 49, 1] 2025-09-07T09:46:53.0280931Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:53.0281611Z triton_bmm_2381 0.0079 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:53.0283084Z triton_bmm_2383 0.0080 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:46:53.0284069Z triton_bmm_2380 0.0080 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:53.0285235Z triton_bmm_2378 0.0081 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:53.0286217Z triton_bmm_2384 0.0081 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=8 2025-09-07T09:46:53.0287213Z triton_bmm_2382 0.0082 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:53.0288244Z triton_bmm_2377 0.0082 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:53.0289147Z triton_bmm_2374 0.0083 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:53.0290065Z triton_bmm_2379 0.0083 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:53.0290963Z triton_bmm_2371 0.0083 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:46:53.0291746Z SingleProcess AUTOTUNE benchmarking takes 0.3398 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:46:53.1673639Z Autotune Choices Stats: 2025-09-07T09:46:53.1674626Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_2392", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.007679999805986881, "best_triton_pos": 0} 2025-09-07T09:46:53.2615182Z AUTOTUNE bmm(256x49x49, 256x49x32) 2025-09-07T09:46:53.2615472Z strides: [2432, 49, 1], [1600, 32, 1] 2025-09-07T09:46:53.2615756Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:53.2616450Z triton_bmm_2392 0.0077 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 
2025-09-07T09:46:53.2617493Z triton_bmm_2398 0.0077 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:53.2618498Z triton_bmm_2386 0.0079 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:53.2619486Z triton_bmm_2395 0.0079 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:53.2620455Z triton_bmm_2396 0.0080 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:53.2621817Z triton_bmm_2387 0.0080 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:53.2622809Z triton_bmm_2391 0.0081 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:53.2623957Z triton_bmm_2397 0.0081 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:46:53.2625104Z triton_bmm_2394 0.0082 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:53.2626086Z triton_bmm_2389 0.0083 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:53.2626952Z SingleProcess AUTOTUNE benchmarking takes 0.2329 seconds and 0.0002 
seconds precompiling for 15 choices 2025-09-07T09:46:53.6020087Z Autotune Choices Stats: 2025-09-07T09:46:53.6021069Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_2407", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.009696000255644321, "best_triton_pos": 0} 2025-09-07T09:46:53.7957298Z AUTOTUNE mm(392x1024, 1024x1024) 2025-09-07T09:46:53.7957675Z strides: [1024, 1], [1, 1024] 2025-09-07T09:46:53.7958002Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:53.7958780Z triton_mm_2407 0.0097 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:53.7959434Z mm 0.0101 ms 96.2% 2025-09-07T09:46:53.7960060Z triton_mm_2411 0.0112 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:53.7961045Z triton_mm_2403 0.0124 ms 78.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:53.7962012Z triton_mm_2406 0.0126 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:53.7963002Z triton_mm_2417 0.0128 ms 75.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:53.7963979Z triton_mm_2410 0.0131 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:53.7965369Z triton_mm_2400 0.0142 ms 68.1% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:53.7966401Z triton_mm_2413 0.0144 ms 67.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:46:53.7967470Z triton_mm_2416 0.0144 ms 67.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:53.7968302Z SingleProcess AUTOTUNE benchmarking takes 0.5334 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:46:54.0754922Z Autotune Choices Stats: 2025-09-07T09:46:54.0757233Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01692800037562847, "best_triton_pos": 1, "best_triton_time": 0.017184000462293625, "best_triton_kernel": "triton_mm_2445", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:46:54.2552226Z AUTOTUNE mm(392x4096, 4096x1024) 2025-09-07T09:46:54.2552491Z strides: [4096, 1], [1, 4096] 2025-09-07T09:46:54.2552750Z dtypes: torch.float16, torch.float16 2025-09-07T09:46:54.2553008Z mm 0.0169 ms 100.0% 2025-09-07T09:46:54.2553933Z triton_mm_2445 0.0172 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:54.2555111Z triton_mm_2449 0.0194 ms 87.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:46:54.2556130Z triton_mm_2441 0.0242 ms 69.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:46:54.2557132Z triton_mm_2455 0.0276 ms 61.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:54.2558101Z triton_mm_2444 0.0335 ms 50.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:46:54.2559025Z triton_mm_2448 0.0338 ms 50.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:54.2559903Z triton_mm_2438 0.0352 ms 48.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:46:54.2560771Z triton_mm_2440 0.0366 ms 46.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:46:54.2561658Z triton_mm_2454 0.0370 ms 45.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:46:54.2562446Z SingleProcess AUTOTUNE benchmarking takes 0.4590 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:40.9491396Z Autotune Choices Stats: 2025-09-07T09:47:40.9492767Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01894400082528591, "best_triton_pos": 1, "best_triton_time": 0.021663999184966087, "best_triton_kernel": "triton_mm_7331", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:47:41.0737796Z AUTOTUNE mm(25088x128, 128x512) 2025-09-07T09:47:41.0738167Z 
strides: [128, 1], [512, 1] 2025-09-07T09:47:41.0738465Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:41.0738743Z mm 0.0189 ms 100.0% 2025-09-07T09:47:41.0739381Z triton_mm_7331 0.0217 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.0740396Z triton_mm_7338 0.0217 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.0741506Z triton_mm_7339 0.0224 ms 84.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.0742408Z triton_mm_7335 0.0228 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.0743758Z triton_mm_7332 0.0230 ms 82.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:41.0744651Z triton_mm_7333 0.0231 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.0746003Z triton_mm_7336 0.0236 ms 80.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:41.0746916Z triton_mm_7328 0.0236 ms 80.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:47:41.0747807Z triton_mm_7334 0.0241 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:41.0748609Z SingleProcess AUTOTUNE benchmarking takes 0.3528 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:47:41.6264803Z Autotune Choices Stats: 2025-09-07T09:47:41.6266422Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.014112000353634357, "best_triton_pos": 1, "best_triton_time": 0.01600000075995922, "best_triton_kernel": "triton_mm_6883", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:47:41.9092842Z AUTOTUNE mm(6272x256, 256x1024) 2025-09-07T09:47:41.9093091Z strides: [256, 1], [1024, 1] 2025-09-07T09:47:41.9093322Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:41.9093561Z mm 0.0141 ms 100.0% 2025-09-07T09:47:41.9094104Z triton_mm_6883 0.0160 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.9095272Z triton_mm_6877 0.0162 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.9096177Z triton_mm_6882 0.0172 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.9097043Z triton_mm_6875 0.0175 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.9097919Z triton_mm_6876 0.0177 ms 79.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:41.9098787Z triton_mm_6884 0.0179 ms 78.9% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:41.9099686Z triton_mm_6880 0.0181 ms 78.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:41.9100539Z triton_mm_6879 0.0184 ms 76.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:41.9101490Z triton_mm_6873 0.0198 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:41.9102233Z SingleProcess AUTOTUNE benchmarking takes 0.5020 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:44.1074290Z Autotune Choices Stats: 2025-09-07T09:47:44.1076084Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013472000136971474, "best_triton_pos": 1, "best_triton_time": 0.013887999579310417, "best_triton_kernel": "triton_mm_2646", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:47:44.3364146Z AUTOTUNE mm(1024x392, 392x4096) 2025-09-07T09:47:44.3364387Z strides: [1, 1024], [4096, 1] 2025-09-07T09:47:44.3364633Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:44.3364881Z mm 0.0135 ms 100.0% 2025-09-07T09:47:44.3365906Z triton_mm_2646 0.0139 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:44.3366848Z triton_mm_2643 0.0140 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:44.3367761Z triton_mm_2639 0.0141 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:44.3368682Z triton_mm_2645 0.0141 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:44.3369587Z triton_mm_2638 0.0142 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:44.3370507Z triton_mm_2642 0.0148 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:44.3371400Z triton_mm_2640 0.0158 ms 85.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:44.3372513Z triton_mm_2647 0.0158 ms 85.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:44.3373656Z triton_mm_2635 0.0171 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:47:44.3374497Z SingleProcess AUTOTUNE benchmarking takes 0.7584 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:45.4777606Z Autotune Choices Stats: 2025-09-07T09:47:45.4778970Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01375999953597784, "best_triton_pos": 1, "best_triton_time": 0.013919999822974205, "best_triton_kernel": "triton_mm_2676", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:47:45.6524617Z AUTOTUNE mm(4096x392, 392x1024) 2025-09-07T09:47:45.6524931Z strides: [1, 4096], [1024, 1] 2025-09-07T09:47:45.6525543Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:45.6525831Z mm 0.0138 ms 100.0% 2025-09-07T09:47:45.6526478Z triton_mm_2676 0.0139 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:45.6527535Z triton_mm_2677 0.0142 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:45.6528554Z triton_mm_2683 0.0142 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:45.6529567Z triton_mm_2684 0.0144 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:45.6531144Z triton_mm_2681 0.0146 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:45.6532177Z triton_mm_2680 0.0148 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:45.6533396Z triton_mm_2678 0.0158 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:45.6534320Z triton_mm_2685 0.0161 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:45.6535365Z triton_mm_2673 0.0170 ms 81.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:47:45.6536159Z SingleProcess AUTOTUNE benchmarking takes 0.7167 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:46.0891469Z Autotune Choices Stats: 2025-09-07T09:47:46.0892863Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013024000450968742, "best_triton_pos": 1, "best_triton_time": 0.013439999893307686, "best_triton_kernel": "triton_mm_3083", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:47:46.2644031Z AUTOTUNE mm(1568x512, 512x2048) 2025-09-07T09:47:46.2644347Z strides: [512, 1], [2048, 1] 2025-09-07T09:47:46.2644605Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:46.2644876Z mm 0.0130 ms 100.0% 2025-09-07T09:47:46.2645642Z triton_mm_3083 0.0134 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:46.2646660Z triton_mm_3082 0.0142 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:46.2647641Z triton_mm_3075 0.0143 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:46.2648612Z triton_mm_3077 0.0147 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:46.2649567Z triton_mm_3079 0.0152 
ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:46.2650534Z triton_mm_3084 0.0157 ms 83.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:46.2651507Z triton_mm_3076 0.0160 ms 81.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:46.2652535Z triton_mm_3080 0.0168 ms 77.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:46.2653461Z triton_mm_3073 0.0171 ms 76.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:46.2654239Z SingleProcess AUTOTUNE benchmarking takes 0.5437 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:47.1099258Z Autotune Choices Stats: 2025-09-07T09:47:47.1364815Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.011872000060975552, "best_triton_pos": 1, "best_triton_time": 0.012512000277638435, "best_triton_kernel": "triton_mm_2811", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:47:47.1377202Z AUTOTUNE mm(3072x392, 392x1024) 2025-09-07T09:47:47.1377446Z strides: [1, 3072], [1024, 1] 2025-09-07T09:47:47.1377681Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:47.1377915Z mm 0.0119 ms 100.0% 2025-09-07T09:47:47.1378768Z triton_mm_2811 0.0125 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:47.1379713Z triton_mm_2810 0.0126 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:47.1380638Z triton_mm_2814 0.0127 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:47.1381627Z triton_mm_2809 0.0128 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:47.1382554Z triton_mm_2816 0.0131 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:47.1383491Z triton_mm_2817 0.0133 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:47.1384393Z triton_mm_2813 0.0136 ms 87.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:47.1385455Z triton_mm_2818 0.0154 ms 77.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:47.1386360Z triton_mm_2806 0.0158 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:47:47.1387153Z SingleProcess AUTOTUNE benchmarking takes 0.2603 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:48.8180938Z Autotune Choices Stats: 2025-09-07T09:47:48.8182352Z {"num_choices": 20, "num_triton_choices": 19, 
"best_kernel": "mm", "best_time": 0.010495999827980995, "best_triton_pos": 1, "best_triton_time": 0.010623999871313572, "best_triton_kernel": "triton_mm_3046", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T09:47:49.0296586Z AUTOTUNE mm(1024x392, 392x2048) 2025-09-07T09:47:49.0296853Z strides: [1, 1024], [2048, 1] 2025-09-07T09:47:49.0297099Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:49.0297354Z mm 0.0105 ms 100.0% 2025-09-07T09:47:49.0297919Z triton_mm_3046 0.0106 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:49.0298866Z triton_mm_3039 0.0110 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.0299774Z triton_mm_3038 0.0112 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:49.0300685Z triton_mm_3045 0.0112 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.0302058Z triton_mm_3042 0.0112 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:49.0302959Z triton_mm_3037 0.0118 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.0304091Z triton_mm_3035 0.0122 ms 86.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:49.0305300Z triton_mm_3044 0.0123 ms 85.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.0306159Z triton_mm_3041 0.0125 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.0306897Z SingleProcess AUTOTUNE benchmarking takes 1.2588 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:49.5252946Z Autotune Choices Stats: 2025-09-07T09:47:49.5254147Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013055999763309956, "best_triton_pos": 1, "best_triton_time": 0.013856000266969204, "best_triton_kernel": "triton_mm_2628", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T09:47:49.6145325Z AUTOTUNE mm(392x1024, 1024x4096) 2025-09-07T09:47:49.6145595Z strides: [1024, 1], [4096, 1] 2025-09-07T09:47:49.6145831Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:49.6146057Z mm 0.0131 ms 100.0% 2025-09-07T09:47:49.6146619Z triton_mm_2628 0.0139 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:49.6147510Z triton_mm_2621 0.0154 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.6148383Z triton_mm_2627 0.0159 ms 81.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.6149224Z triton_mm_2617 
0.0164 ms 79.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:49.6150066Z triton_mm_2620 0.0165 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:49.6150908Z triton_mm_2624 0.0169 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:49.6151746Z triton_mm_2622 0.0173 ms 75.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:49.6152595Z triton_mm_2619 0.0190 ms 68.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.6153470Z triton_mm_2623 0.0200 ms 65.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:49.6154261Z SingleProcess AUTOTUNE benchmarking takes 0.3850 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:51.0116921Z Autotune Choices Stats: 2025-09-07T09:47:51.0118227Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.00863999966531992, "best_triton_pos": 1, "best_triton_time": 0.009056000038981438, "best_triton_kernel": "triton_mm_2717", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:47:51.1962491Z AUTOTUNE mm(1024x392, 392x1024) 2025-09-07T09:47:51.1962765Z strides: [1, 1024], [1024, 1] 2025-09-07T09:47:51.2031373Z dtypes: torch.float16, 
torch.float16 2025-09-07T09:47:51.2031773Z mm 0.0086 ms 100.0% 2025-09-07T09:47:51.2032424Z triton_mm_2717 0.0091 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:51.2033454Z triton_mm_2715 0.0094 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:51.2034855Z triton_mm_2716 0.0094 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:51.2036276Z triton_mm_2719 0.0094 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:51.2037270Z triton_mm_2712 0.0097 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:51.2038243Z triton_mm_2723 0.0099 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:51.2039224Z triton_mm_2714 0.0104 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:51.2040214Z triton_mm_2722 0.0104 ms 82.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:51.2041184Z triton_mm_2713 0.0108 ms 80.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:51.2042048Z 
SingleProcess AUTOTUNE benchmarking takes 0.5725 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:54.1498267Z Autotune Choices Stats: 2025-09-07T09:47:54.1499613Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.012128000147640705, "best_triton_pos": 1, "best_triton_time": 0.014271999709308147, "best_triton_kernel": "triton_mm_3097", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:47:54.1964469Z AUTOTUNE mm(512x1568, 1568x2048) 2025-09-07T09:47:54.1964826Z strides: [1, 512], [2048, 1] 2025-09-07T09:47:54.1965499Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:54.1965801Z mm 0.0121 ms 100.0% 2025-09-07T09:47:54.1966425Z triton_mm_3097 0.0143 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:54.1967481Z triton_mm_3103 0.0174 ms 69.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:54.1968479Z triton_mm_3093 0.0177 ms 68.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:54.1969912Z triton_mm_3096 0.0180 ms 67.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:54.1970888Z triton_mm_3099 0.0182 ms 66.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:54.1972078Z triton_mm_3092 0.0183 ms 66.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:54.1973055Z triton_mm_3095 0.0184 ms 65.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:54.1974032Z triton_mm_3102 0.0196 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:54.1975290Z triton_mm_3094 0.0236 ms 51.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:54.1976090Z SingleProcess AUTOTUNE benchmarking takes 0.6962 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:54.8411242Z Autotune Choices Stats: 2025-09-07T09:47:54.8412963Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.012480000033974648, "best_triton_pos": 1, "best_triton_time": 0.013728000223636627, "best_triton_kernel": "triton_mm_3135", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:47:55.0356865Z AUTOTUNE mm(2048x1568, 1568x512) 2025-09-07T09:47:55.0357198Z strides: [1, 2048], [512, 1] 2025-09-07T09:47:55.0357500Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:55.0357838Z mm 0.0125 ms 100.0% 2025-09-07T09:47:55.0358553Z triton_mm_3135 0.0137 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:55.0359745Z triton_mm_3131 0.0173 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 
2025-09-07T09:47:55.0360934Z triton_mm_3141 0.0173 ms 72.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:55.0362114Z triton_mm_3134 0.0175 ms 71.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:55.0363265Z triton_mm_3130 0.0178 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:55.0364434Z triton_mm_3133 0.0180 ms 69.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:55.0366004Z triton_mm_3137 0.0184 ms 67.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:55.0367191Z triton_mm_3140 0.0194 ms 64.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:55.0368353Z triton_mm_3132 0.0232 ms 53.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:55.0369355Z SingleProcess AUTOTUNE benchmarking takes 0.4127 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:55.6487210Z Autotune Choices Stats: 2025-09-07T09:47:55.6488543Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.012095999903976917, "best_triton_pos": 1, "best_triton_time": 0.013824000023305416, "best_triton_kernel": "triton_mm_3268", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:47:55.6752221Z AUTOTUNE mm(1536x1568, 1568x512) 2025-09-07T09:47:55.6752839Z strides: [1, 1536], [512, 1] 2025-09-07T09:47:55.6753114Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:55.6753408Z mm 0.0121 ms 100.0% 2025-09-07T09:47:55.6754059Z triton_mm_3268 0.0138 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:55.6755546Z triton_mm_3264 0.0172 ms 70.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:55.6756617Z triton_mm_3263 0.0174 ms 69.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:55.6757674Z triton_mm_3274 0.0176 ms 68.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:55.6758735Z triton_mm_3266 0.0176 ms 68.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:55.6759784Z triton_mm_3267 0.0178 ms 68.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:55.6760858Z triton_mm_3270 0.0180 ms 67.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:55.6761916Z triton_mm_3273 0.0196 ms 61.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4 2025-09-07T09:47:55.6762977Z triton_mm_3260 0.0219 ms 55.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:55.6763895Z SingleProcess AUTOTUNE benchmarking takes 0.2453 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:56.2592331Z Autotune Choices Stats: 2025-09-07T09:47:56.2594507Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01071999967098236, "best_triton_pos": 1, "best_triton_time": 0.011455999687314034, "best_triton_kernel": "triton_mm_6836", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:47:56.2909848Z AUTOTUNE mm(512x1568, 1568x1024) 2025-09-07T09:47:56.2910209Z strides: [1, 512], [1024, 1] 2025-09-07T09:47:56.2910573Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:56.2910937Z mm 0.0107 ms 100.0% 2025-09-07T09:47:56.2911883Z triton_mm_6836 0.0115 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:56.2913461Z triton_mm_6840 0.0138 ms 77.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:56.2915179Z triton_mm_6832 0.0162 ms 66.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:56.2917016Z triton_mm_6835 0.0169 ms 63.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:56.2918370Z triton_mm_6846 0.0170 ms 63.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:56.2920023Z triton_mm_6839 0.0172 ms 62.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:56.2921377Z triton_mm_6838 0.0173 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:56.2922740Z triton_mm_6842 0.0174 ms 61.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:56.2924076Z triton_mm_6831 0.0179 ms 60.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:56.2925402Z SingleProcess AUTOTUNE benchmarking takes 0.2482 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:56.7191222Z Autotune Choices Stats: 2025-09-07T09:47:56.7192540Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010143999941647053, "best_triton_pos": 1, "best_triton_time": 0.010688000358641148, "best_triton_kernel": "triton_mm_3169", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:47:56.8323508Z AUTOTUNE mm(512x1568, 1568x512) 2025-09-07T09:47:56.8323869Z strides: [1, 512], [512, 1] 2025-09-07T09:47:56.8324140Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:56.8324425Z mm 0.0101 ms 100.0% 2025-09-07T09:47:56.8325601Z triton_mm_3169 0.0107 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=4 2025-09-07T09:47:56.8326857Z triton_mm_3165 0.0109 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:56.8327999Z triton_mm_3173 0.0129 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:56.8328976Z triton_mm_3164 0.0143 ms 71.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:56.8329944Z triton_mm_3163 0.0150 ms 67.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:56.8330900Z triton_mm_3168 0.0158 ms 64.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:56.8331870Z triton_mm_3179 0.0159 ms 63.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:56.8332848Z triton_mm_3172 0.0161 ms 63.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:56.8333826Z triton_mm_3171 0.0165 ms 61.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:47:56.8335378Z SingleProcess AUTOTUNE benchmarking takes 0.3258 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:57.5086772Z Autotune Choices Stats: 2025-09-07T09:47:57.5088095Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", 
"best_time": 0.01945599913597107, "best_triton_pos": 1, "best_triton_time": 0.020479999482631683, "best_triton_kernel": "triton_mm_6889", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:47:57.5606690Z AUTOTUNE mm(256x6272, 6272x1024) 2025-09-07T09:47:57.5606983Z strides: [1, 256], [1024, 1] 2025-09-07T09:47:57.5607251Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:57.5607520Z mm 0.0195 ms 100.0% 2025-09-07T09:47:57.5608173Z triton_mm_6889 0.0205 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:57.5609195Z triton_mm_6893 0.0213 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:57.5610175Z triton_mm_6897 0.0252 ms 77.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:57.5611180Z triton_mm_6903 0.0398 ms 48.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:57.5612147Z triton_mm_6888 0.0423 ms 46.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:57.5613105Z triton_mm_6887 0.0440 ms 44.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:57.5614085Z triton_mm_6896 0.0452 ms 43.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T09:47:57.5615358Z triton_mm_6892 0.0453 ms 43.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:57.5616432Z triton_mm_6902 0.0511 ms 38.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:57.5617217Z SingleProcess AUTOTUNE benchmarking takes 0.3635 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:57.9120956Z Autotune Choices Stats: 2025-09-07T09:47:57.9122229Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.019807999953627586, "best_triton_pos": 1, "best_triton_time": 0.0208320003002882, "best_triton_kernel": "triton_mm_6927", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:47:57.9737911Z AUTOTUNE mm(1024x6272, 6272x256) 2025-09-07T09:47:57.9738202Z strides: [1, 1024], [256, 1] 2025-09-07T09:47:57.9738464Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:57.9738732Z mm 0.0198 ms 100.0% 2025-09-07T09:47:57.9739369Z triton_mm_6927 0.0208 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:57.9740401Z triton_mm_6931 0.0216 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:57.9741410Z triton_mm_6935 0.0261 ms 75.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:57.9742765Z triton_mm_6941 0.0399 ms 49.7% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:57.9743739Z triton_mm_6926 0.0424 ms 46.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:57.9744921Z triton_mm_6925 0.0437 ms 45.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:57.9746447Z triton_mm_6930 0.0454 ms 43.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:57.9747399Z triton_mm_6934 0.0458 ms 43.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:57.9748369Z triton_mm_6940 0.0515 ms 38.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:57.9749220Z SingleProcess AUTOTUNE benchmarking takes 0.3691 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:58.3434437Z Autotune Choices Stats: 2025-09-07T09:47:58.3436017Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.016383999958634377, "best_triton_pos": 1, "best_triton_time": 0.019840000197291374, "best_triton_kernel": "triton_mm_7060", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:47:58.3854914Z AUTOTUNE mm(768x6272, 6272x256) 2025-09-07T09:47:58.3855312Z strides: [1, 768], [256, 1] 2025-09-07T09:47:58.3855581Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:58.3855868Z mm 0.0164 
ms 100.0% 2025-09-07T09:47:58.3856498Z triton_mm_7060 0.0198 ms 82.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:58.3857513Z triton_mm_7064 0.0211 ms 77.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:58.3858499Z triton_mm_7068 0.0260 ms 63.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:58.3859484Z triton_mm_7074 0.0398 ms 41.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:58.3860465Z triton_mm_7059 0.0410 ms 40.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:58.3861530Z triton_mm_7058 0.0424 ms 38.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:58.3862500Z triton_mm_7063 0.0436 ms 37.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:58.3863460Z triton_mm_7067 0.0447 ms 36.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:58.3864434Z triton_mm_7073 0.0499 ms 32.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:58.3865852Z SingleProcess AUTOTUNE benchmarking takes 0.3490 seconds and 0.0002 
seconds precompiling for 20 choices 2025-09-07T09:47:59.1010713Z Autotune Choices Stats: 2025-09-07T09:47:59.1012513Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01616000011563301, "best_triton_pos": 1, "best_triton_time": 0.019807999953627586, "best_triton_kernel": "triton_mm_7288", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:47:59.2629439Z AUTOTUNE mm(256x6272, 6272x512) 2025-09-07T09:47:59.2629714Z strides: [1, 256], [512, 1] 2025-09-07T09:47:59.2629973Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:59.2630245Z mm 0.0162 ms 100.0% 2025-09-07T09:47:59.2630848Z triton_mm_7288 0.0198 ms 81.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:59.2631839Z triton_mm_7292 0.0212 ms 76.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:59.2632801Z triton_mm_7296 0.0247 ms 65.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:59.2633773Z triton_mm_7302 0.0400 ms 40.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:59.2634727Z triton_mm_7287 0.0404 ms 40.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:59.2636034Z triton_mm_7286 0.0422 ms 38.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T09:47:59.2636986Z triton_mm_7295 0.0430 ms 37.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:59.2637943Z triton_mm_7291 0.0430 ms 37.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:59.2638842Z triton_mm_7301 0.0493 ms 32.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:59.2639628Z SingleProcess AUTOTUNE benchmarking takes 0.6668 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:47:59.7661550Z Autotune Choices Stats: 2025-09-07T09:47:59.7662867Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.015744000673294067, "best_triton_pos": 1, "best_triton_time": 0.018624000251293182, "best_triton_kernel": "triton_mm_6965", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:47:59.8909016Z AUTOTUNE mm(256x6272, 6272x256) 2025-09-07T09:47:59.8909274Z strides: [1, 256], [256, 1] 2025-09-07T09:47:59.8909526Z dtypes: torch.float16, torch.float16 2025-09-07T09:47:59.8909793Z mm 0.0157 ms 100.0% 2025-09-07T09:47:59.8910415Z triton_mm_6965 0.0186 ms 84.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:59.8911402Z triton_mm_6969 0.0205 ms 76.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:47:59.8912806Z triton_mm_6973 0.0253 ms 62.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:47:59.8913774Z triton_mm_6964 0.0383 ms 41.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:59.8914750Z triton_mm_6979 0.0390 ms 40.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:59.8916302Z triton_mm_6963 0.0401 ms 39.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:47:59.8917255Z triton_mm_6968 0.0408 ms 38.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:47:59.8918183Z triton_mm_6972 0.0423 ms 37.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:59.8919084Z triton_mm_6978 0.0482 ms 32.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:47:59.8919876Z SingleProcess AUTOTUNE benchmarking takes 0.4309 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:04.3594149Z Autotune Choices Stats: 2025-09-07T09:48:04.3596089Z {"num_choices": 29, "num_triton_choices": 19, "best_kernel": "decompose_k_mm_7_split_3", "best_kernel_desc": "k_split=7", "best_time": 0.025119999423623085, "best_triton_pos": 4, "best_triton_time": 0.05708799883723259, "best_triton_kernel": "triton_mm_7345", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4"} 2025-09-07T09:48:04.4038971Z AUTOTUNE mm(128x25088, 25088x512) 2025-09-07T09:48:04.4039292Z strides: [1, 128], [512, 1] 2025-09-07T09:48:04.4039518Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:04.4039779Z decompose_k_mm_7_split_3 0.0251 ms 100.0% k_split=7 2025-09-07T09:48:04.4040044Z mm 0.0258 ms 97.5% 2025-09-07T09:48:04.4040268Z decompose_k_mm_4_split_2 0.0281 ms 89.4% k_split=4 2025-09-07T09:48:04.4040546Z decompose_k_mm_2_split_1 0.0300 ms 83.9% k_split=2 2025-09-07T09:48:04.4041146Z triton_mm_7345 0.0571 ms 44.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:04.4041975Z triton_mm_7349 0.0626 ms 40.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:04.4042508Z decompose_k_mm_8_split_4 0.0641 ms 39.2% k_split=8 2025-09-07T09:48:04.4042816Z decompose_k_mm_14_split_5 0.0642 ms 39.1% k_split=14 2025-09-07T09:48:04.4043099Z decompose_k_mm_16_split_6 0.0643 ms 39.1% k_split=16 2025-09-07T09:48:04.4043367Z decompose_k_mm_28_split_7 0.0652 ms 38.5% k_split=28 2025-09-07T09:48:04.4043809Z SingleProcess AUTOTUNE benchmarking takes 4.3108 seconds and 0.0002 seconds precompiling for 29 choices 2025-09-07T09:48:06.4051234Z Autotune Choices Stats: 2025-09-07T09:48:06.4052742Z {"num_choices": 29, "num_triton_choices": 19, "best_kernel": "decompose_k_mm_7_split_12", "best_kernel_desc": "k_split=7", "best_time": 0.02502400055527687, "best_triton_pos": 4, "best_triton_time": 0.05724800005555153, "best_triton_kernel": "triton_mm_7383", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:48:06.4502017Z AUTOTUNE mm(512x25088, 25088x128) 2025-09-07T09:48:06.4502791Z strides: [1, 
512], [128, 1] 2025-09-07T09:48:06.4503066Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:06.4503408Z decompose_k_mm_7_split_12 0.0250 ms 100.0% k_split=7 2025-09-07T09:48:06.4503727Z mm 0.0258 ms 96.9% 2025-09-07T09:48:06.4504004Z decompose_k_mm_4_split_11 0.0282 ms 88.8% k_split=4 2025-09-07T09:48:06.4504349Z decompose_k_mm_2_split_10 0.0298 ms 84.0% k_split=2 2025-09-07T09:48:06.4505420Z triton_mm_7383 0.0572 ms 43.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:06.4506744Z triton_mm_7387 0.0629 ms 39.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:06.4507438Z decompose_k_mm_16_split_15 0.0640 ms 39.1% k_split=16 2025-09-07T09:48:06.4507790Z decompose_k_mm_14_split_14 0.0641 ms 39.0% k_split=14 2025-09-07T09:48:06.4508160Z decompose_k_mm_8_split_13 0.0645 ms 38.8% k_split=8 2025-09-07T09:48:06.4508500Z decompose_k_mm_28_split_16 0.0651 ms 38.4% k_split=28 2025-09-07T09:48:06.4509097Z SingleProcess AUTOTUNE benchmarking takes 1.8173 seconds and 0.0002 seconds precompiling for 29 choices 2025-09-07T09:48:09.5031672Z Autotune Choices Stats: 2025-09-07T09:48:09.5033168Z {"num_choices": 30, "num_triton_choices": 19, "best_kernel": "decompose_k_mm_7_split_31", "best_kernel_desc": "k_split=7", "best_time": 0.02271999977529049, "best_triton_pos": 11, "best_triton_time": 0.05558399856090546, "best_triton_kernel": "triton_mm_7516", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:48:09.8590050Z AUTOTUNE mm(384x25088, 25088x128) 2025-09-07T09:48:09.8590326Z strides: [1, 384], [128, 1] 2025-09-07T09:48:09.8590617Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:09.8590980Z 
decompose_k_mm_7_split_31 0.0227 ms 100.0% k_split=7 2025-09-07T09:48:09.8591364Z decompose_k_mm_4_split_30 0.0231 ms 98.2% k_split=4 2025-09-07T09:48:09.8591670Z mm 0.0240 ms 94.5% 2025-09-07T09:48:09.8591920Z decompose_k_mm_2_split_29 0.0277 ms 82.0% k_split=2 2025-09-07T09:48:09.8592291Z decompose_k_mm_14_split_33 0.0514 ms 44.2% k_split=14 2025-09-07T09:48:09.8592672Z decompose_k_mm_8_split_32 0.0515 ms 44.1% k_split=8 2025-09-07T09:48:09.8593026Z decompose_k_mm_16_split_34 0.0519 ms 43.8% k_split=16 2025-09-07T09:48:09.8593369Z decompose_k_mm_28_split_35 0.0523 ms 43.5% k_split=28 2025-09-07T09:48:09.8593713Z decompose_k_mm_32_split_37 0.0526 ms 43.2% k_split=32 2025-09-07T09:48:09.8594048Z decompose_k_mm_49_split_28 0.0528 ms 43.0% k_split=49 2025-09-07T09:48:09.8594596Z SingleProcess AUTOTUNE benchmarking takes 2.9553 seconds and 0.0002 seconds precompiling for 30 choices 2025-09-07T09:48:13.1934374Z Autotune Choices Stats: 2025-09-07T09:48:13.1936794Z {"num_choices": 30, "num_triton_choices": 19, "best_kernel": "decompose_k_mm_7_split_23", "best_kernel_desc": "k_split=7", "best_time": 0.016543999314308167, "best_triton_pos": 11, "best_triton_time": 0.05488000065088272, "best_triton_kernel": "triton_mm_7421", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:48:13.3802878Z AUTOTUNE mm(128x25088, 25088x128) 2025-09-07T09:48:13.3803356Z strides: [1, 128], [128, 1] 2025-09-07T09:48:13.3803759Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:13.3804291Z decompose_k_mm_7_split_23 0.0165 ms 100.0% k_split=7 2025-09-07T09:48:13.3804777Z mm 0.0172 ms 96.3% 2025-09-07T09:48:13.3805700Z decompose_k_mm_98_split_19 0.0188 ms 87.8% k_split=98 2025-09-07T09:48:13.3806253Z decompose_k_mm_4_split_22 0.0197 ms 84.1% k_split=4 2025-09-07T09:48:13.3806843Z decompose_k_mm_196_split_20 0.0228 ms 72.5% k_split=196 
2025-09-07T09:48:13.3807428Z decompose_k_mm_28_split_27 0.0254 ms 65.0% k_split=28 2025-09-07T09:48:13.3808669Z decompose_k_mm_8_split_24 0.0257 ms 64.4% k_split=8 2025-09-07T09:48:13.3809202Z decompose_k_mm_16_split_26 0.0259 ms 63.8% k_split=16 2025-09-07T09:48:13.3809718Z decompose_k_mm_49_split_18 0.0260 ms 63.7% k_split=49 2025-09-07T09:48:13.3810372Z decompose_k_mm_14_split_25 0.0260 ms 63.6% k_split=14 2025-09-07T09:48:13.3811126Z SingleProcess AUTOTUNE benchmarking takes 3.3357 seconds and 0.0002 seconds precompiling for 30 choices 2025-09-07T09:48:15.2640888Z Autotune Choices Stats: 2025-09-07T09:48:15.2643755Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.0163199994713068, "best_triton_pos": 1, "best_triton_time": 0.016607999801635742, "best_triton_kernel": "triton_mm_2656", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:48:15.6837875Z AUTOTUNE mm(392x4096, 4096x1024) 2025-09-07T09:48:15.6838176Z strides: [4096, 1], [1024, 1] 2025-09-07T09:48:15.6838436Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:15.6838703Z mm 0.0163 ms 100.0% 2025-09-07T09:48:15.6839308Z triton_mm_2656 0.0166 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:15.6840316Z triton_mm_2660 0.0205 ms 79.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:15.6841492Z triton_mm_2652 0.0256 ms 63.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:15.6842554Z triton_mm_2666 0.0288 ms 56.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:15.6843573Z triton_mm_2659 0.0320 ms 50.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:15.6844557Z triton_mm_2655 0.0325 ms 50.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:15.6845890Z triton_mm_2665 0.0356 ms 45.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:15.6846898Z triton_mm_2651 0.0367 ms 44.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:15.6847893Z triton_mm_2658 0.0378 ms 43.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:15.6848770Z SingleProcess AUTOTUNE benchmarking takes 0.6902 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:16.2174779Z Autotune Choices Stats: 2025-09-07T09:48:16.2176722Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_2694", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00940799992531538, "best_triton_pos": 0} 2025-09-07T09:48:16.3614071Z AUTOTUNE mm(392x1024, 1024x1024) 2025-09-07T09:48:16.3614361Z strides: [1024, 1], [1024, 1] 2025-09-07T09:48:16.3614635Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:16.3615644Z triton_mm_2694 0.0094 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:16.3616286Z mm 0.0096 ms 97.7% 2025-09-07T09:48:16.3626154Z triton_mm_2698 0.0105 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:16.3627072Z triton_mm_2693 0.0122 ms 77.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:16.3628174Z triton_mm_2690 0.0123 ms 76.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:16.3629085Z triton_mm_2697 0.0126 ms 74.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:16.3630004Z triton_mm_2704 0.0126 ms 74.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:16.3630938Z triton_mm_2696 0.0136 ms 69.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:16.3631914Z triton_mm_2700 0.0139 ms 67.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:16.3632887Z triton_mm_2689 0.0140 ms 67.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:16.3633723Z SingleProcess AUTOTUNE benchmarking takes 0.6761 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:16.9577864Z Autotune Choices Stats: 
2025-09-07T09:48:16.9578920Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_2731", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.0074880002066493034, "best_triton_pos": 0} 2025-09-07T09:48:17.1816021Z AUTOTUNE bmm(256x49x49, 256x49x32) 2025-09-07T09:48:17.1816368Z strides: [2432, 1, 49], [1568, 32, 1] 2025-09-07T09:48:17.1816649Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:17.1817326Z triton_bmm_2731 0.0075 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:17.1818339Z triton_bmm_2737 0.0075 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:17.1819338Z triton_bmm_2725 0.0077 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:17.1820351Z triton_bmm_2726 0.0078 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:17.1821359Z triton_bmm_2733 0.0079 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:17.1822324Z triton_bmm_2728 0.0080 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:17.1823168Z triton_bmm_2734 0.0080 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T09:48:17.1824015Z triton_bmm_2736 0.0080 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:17.1825598Z triton_bmm_2735 0.0080 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:17.1826497Z triton_bmm_2727 0.0081 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:17.1827242Z SingleProcess AUTOTUNE benchmarking takes 0.8191 seconds and 0.0004 seconds precompiling for 15 choices 2025-09-07T09:48:17.3196350Z Autotune Choices Stats: 2025-09-07T09:48:17.3197350Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_2746", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007872000336647034, "best_triton_pos": 0} 2025-09-07T09:48:17.4163283Z AUTOTUNE bmm(256x49x32, 256x32x49) 2025-09-07T09:48:17.4163596Z strides: [1568, 32, 1], [1600, 1, 32] 2025-09-07T09:48:17.4163879Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:17.4164546Z triton_bmm_2746 0.0079 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:17.4165774Z triton_bmm_2744 0.0079 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:17.4166753Z triton_bmm_2749 0.0079 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=4 2025-09-07T09:48:17.4167703Z triton_bmm_2742 0.0079 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:17.4168665Z triton_bmm_2747 0.0079 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:17.4169621Z triton_bmm_2740 0.0080 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:17.4170582Z triton_bmm_2748 0.0080 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:17.4171641Z triton_bmm_2750 0.0080 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:17.4172532Z triton_bmm_2751 0.0080 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:17.4173431Z triton_bmm_2739 0.0080 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:17.4174208Z SingleProcess AUTOTUNE benchmarking takes 0.2341 seconds and 0.0003 seconds precompiling for 15 choices 2025-09-07T09:48:17.5627025Z Autotune Choices Stats: 2025-09-07T09:48:17.5628043Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_2760", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007840000092983246, 
"best_triton_pos": 0} 2025-09-07T09:48:17.6489650Z AUTOTUNE bmm(256x32x49, 256x49x49) 2025-09-07T09:48:17.6490158Z strides: [1600, 1, 32], [2401, 49, 1] 2025-09-07T09:48:17.6490609Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:17.6491852Z triton_bmm_2760 0.0078 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:17.6493534Z triton_bmm_2764 0.0078 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:17.6494444Z triton_bmm_2763 0.0079 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:17.6495925Z triton_bmm_2759 0.0079 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:17.6496869Z triton_bmm_2766 0.0079 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:17.6497766Z triton_bmm_2753 0.0080 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:17.6498674Z triton_bmm_2756 0.0080 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:17.6499616Z triton_bmm_2755 0.0083 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:17.6500537Z triton_bmm_2754 0.0085 ms 92.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:17.6501548Z triton_bmm_2761 0.0085 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:17.6502349Z SingleProcess AUTOTUNE benchmarking takes 0.2318 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T09:48:17.7872118Z Autotune Choices Stats: 2025-09-07T09:48:17.7873542Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_2774", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.00774399982765317, "best_triton_pos": 0} 2025-09-07T09:48:17.8820168Z AUTOTUNE bmm(256x49x49, 256x49x32) 2025-09-07T09:48:17.8820714Z strides: [2401, 49, 1], [1600, 1, 49] 2025-09-07T09:48:17.8821175Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:17.8822275Z triton_bmm_2774 0.0077 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:17.8823308Z triton_bmm_2780 0.0078 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:17.8824310Z triton_bmm_2768 0.0078 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:17.8825497Z triton_bmm_2770 0.0079 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:17.8826479Z triton_bmm_2776 0.0079 ms 98.4% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:17.8827466Z triton_bmm_2779 0.0079 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:17.8828433Z triton_bmm_2773 0.0079 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:17.8829701Z triton_bmm_2775 0.0079 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:17.8830671Z triton_bmm_2778 0.0081 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:17.8831829Z triton_bmm_2771 0.0081 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:17.8832854Z SingleProcess AUTOTUNE benchmarking takes 0.2321 seconds and 0.0003 seconds precompiling for 15 choices 2025-09-07T09:48:18.1289454Z Autotune Choices Stats: 2025-09-07T09:48:18.1290704Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013887999579310417, "best_triton_pos": 1, "best_triton_time": 0.014208000153303146, "best_triton_kernel": "triton_mm_2789", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:48:18.3428093Z AUTOTUNE mm(392x3072, 3072x1024) 2025-09-07T09:48:18.3428361Z strides: [3072, 1], [1024, 1] 2025-09-07T09:48:18.3428586Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:18.3428819Z 
mm 0.0139 ms 100.0% 2025-09-07T09:48:18.3429364Z triton_mm_2789 0.0142 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:18.3430253Z triton_mm_2793 0.0172 ms 80.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:18.3431115Z triton_mm_2785 0.0210 ms 66.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:18.3431969Z triton_mm_2799 0.0235 ms 59.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:18.3432950Z triton_mm_2792 0.0251 ms 55.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:18.3433925Z triton_mm_2788 0.0257 ms 54.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:18.3434898Z triton_mm_2798 0.0281 ms 49.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:18.3436178Z triton_mm_2784 0.0294 ms 47.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:18.3437150Z triton_mm_2791 0.0298 ms 46.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:18.5623563Z SingleProcess AUTOTUNE benchmarking takes 0.4600 seconds 
and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:48:18.5624105Z Autotune Choices Stats: 2025-09-07T09:48:18.5625627Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010623999871313572, "best_triton_pos": 1, "best_triton_time": 0.011648000217974186, "best_triton_kernel": "triton_mm_3059", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:48:18.5742576Z AUTOTUNE mm(392x1024, 1024x2048) 2025-09-07T09:48:18.5742856Z strides: [1024, 1], [2048, 1] 2025-09-07T09:48:18.5743151Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:18.5743427Z mm 0.0106 ms 100.0% 2025-09-07T09:48:18.5744041Z triton_mm_3059 0.0116 ms 91.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:18.5745463Z triton_mm_3065 0.0132 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:18.5746449Z triton_mm_3054 0.0132 ms 80.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:18.5747415Z triton_mm_3058 0.0136 ms 78.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:18.5748395Z triton_mm_3055 0.0140 ms 76.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:18.5749364Z triton_mm_3057 0.0147 ms 72.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=4, num_warps=8 2025-09-07T09:48:18.5750335Z triton_mm_3061 0.0148 ms 71.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:18.5751332Z triton_mm_3064 0.0150 ms 70.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:18.5752722Z triton_mm_3056 0.0176 ms 60.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:18.5753671Z SingleProcess AUTOTUNE benchmarking takes 0.2179 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:18.8050337Z Autotune Choices Stats: 2025-09-07T09:48:18.8051620Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013663999736309052, "best_triton_pos": 1, "best_triton_time": 0.014655999839305878, "best_triton_kernel": "triton_mm_3116", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:48:19.0360180Z AUTOTUNE mm(1568x2048, 2048x512) 2025-09-07T09:48:19.0360446Z strides: [2048, 1], [512, 1] 2025-09-07T09:48:19.0360697Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:19.0360952Z mm 0.0137 ms 100.0% 2025-09-07T09:48:19.0361558Z triton_mm_3116 0.0147 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:19.0362806Z triton_mm_3112 0.0180 ms 75.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.0363830Z triton_mm_3122 0.0188 ms 72.9% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:19.0364814Z triton_mm_3115 0.0202 ms 67.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.0366126Z triton_mm_3111 0.0203 ms 67.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:19.0367376Z triton_mm_3121 0.0219 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.0368368Z triton_mm_3114 0.0233 ms 58.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:19.0369582Z triton_mm_3118 0.0239 ms 57.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:19.0370559Z triton_mm_3108 0.0270 ms 50.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.0371413Z SingleProcess AUTOTUNE benchmarking takes 0.4597 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:19.2288140Z Autotune Choices Stats: 2025-09-07T09:48:19.2289362Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.008767999708652496, "best_triton_pos": 1, "best_triton_time": 0.009344000369310379, "best_triton_kernel": "triton_mm_3154", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 
2025-09-07T09:48:19.2438198Z AUTOTUNE mm(1568x512, 512x512) 2025-09-07T09:48:19.2438451Z strides: [512, 1], [512, 1] 2025-09-07T09:48:19.2438696Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:19.2438972Z mm 0.0088 ms 100.0% 2025-09-07T09:48:19.2439566Z triton_mm_3154 0.0093 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:19.2440556Z triton_mm_3153 0.0098 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.2441521Z triton_mm_3149 0.0103 ms 85.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:19.2442467Z triton_mm_3160 0.0104 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:19.2443377Z triton_mm_3152 0.0104 ms 84.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:19.2444273Z triton_mm_3156 0.0107 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:19.2445405Z triton_mm_3159 0.0108 ms 81.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.2446326Z triton_mm_3150 0.0113 ms 77.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.2447225Z triton_mm_3151 0.0118 ms 74.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.2448013Z SingleProcess AUTOTUNE benchmarking takes 0.2066 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:19.3487325Z Autotune Choices Stats: 2025-09-07T09:48:19.3488306Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_3184", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008608000352978706, "best_triton_pos": 0} 2025-09-07T09:48:19.4637743Z AUTOTUNE bmm(512x49x49, 512x49x32) 2025-09-07T09:48:19.4638021Z strides: [2432, 1, 49], [1568, 32, 1] 2025-09-07T09:48:19.4638284Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:19.4638930Z triton_bmm_3184 0.0086 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.4640147Z triton_bmm_3187 0.0086 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:19.4641129Z triton_bmm_3190 0.0086 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.4642117Z triton_bmm_3193 0.0086 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:19.4643084Z triton_bmm_3191 0.0087 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:19.4643981Z triton_bmm_3192 0.0089 ms 96.8% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:19.4644883Z triton_bmm_3181 0.0090 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:19.4645928Z triton_bmm_3189 0.0091 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:19.4646824Z triton_bmm_3183 0.0092 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:19.4647731Z triton_bmm_3182 0.0092 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.4648514Z SingleProcess AUTOTUNE benchmarking takes 0.2187 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:48:19.6036512Z Autotune Choices Stats: 2025-09-07T09:48:19.6037514Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_3202", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008352000266313553, "best_triton_pos": 0} 2025-09-07T09:48:19.6943414Z AUTOTUNE bmm(512x56x32, 512x32x56) 2025-09-07T09:48:19.6943709Z strides: [1792, 32, 1], [1792, 1, 32] 2025-09-07T09:48:19.6944007Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:19.6944697Z triton_bmm_3202 0.0084 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.6945920Z triton_bmm_3207 0.0084 ms 100.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:19.6946917Z triton_bmm_3200 0.0084 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:19.6947895Z triton_bmm_3201 0.0084 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:19.6948875Z triton_bmm_3203 0.0085 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.6950165Z triton_bmm_3205 0.0085 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:19.6951131Z triton_bmm_3206 0.0085 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:19.6952265Z triton_bmm_3198 0.0086 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.6953388Z triton_bmm_3196 0.0086 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:19.6954350Z triton_bmm_3204 0.0087 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:19.6955331Z SingleProcess AUTOTUNE benchmarking takes 0.2298 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:48:19.8055394Z Autotune Choices Stats: 
2025-09-07T09:48:19.8056369Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_3219", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.009279999881982803, "best_triton_pos": 0} 2025-09-07T09:48:19.9258524Z AUTOTUNE bmm(512x32x49, 512x49x49) 2025-09-07T09:48:19.9258977Z strides: [1600, 1, 32], [2401, 49, 1] 2025-09-07T09:48:19.9259422Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:19.9260459Z triton_bmm_3219 0.0093 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.9262291Z triton_bmm_3216 0.0093 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.9263488Z triton_bmm_3220 0.0094 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:19.9264343Z triton_bmm_3215 0.0095 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:19.9265447Z triton_bmm_3217 0.0095 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:19.9266288Z triton_bmm_3212 0.0096 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.9267133Z triton_bmm_3222 0.0097 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=8 2025-09-07T09:48:19.9267963Z triton_bmm_3209 0.0097 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:19.9268804Z triton_bmm_3211 0.0098 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:19.9269642Z triton_bmm_3210 0.0100 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:19.9270373Z SingleProcess AUTOTUNE benchmarking takes 0.2307 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T09:48:20.0304427Z Autotune Choices Stats: 2025-09-07T09:48:20.0305640Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_3226", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.009247999638319016, "best_triton_pos": 0} 2025-09-07T09:48:20.1490428Z AUTOTUNE bmm(512x49x49, 512x49x32) 2025-09-07T09:48:20.1490914Z strides: [2401, 49, 1], [1600, 1, 49] 2025-09-07T09:48:20.1491357Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:20.1492688Z triton_bmm_3226 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:20.1493701Z triton_bmm_3230 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:20.1494700Z triton_bmm_3229 0.0093 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=2, num_warps=4 2025-09-07T09:48:20.1495970Z triton_bmm_3236 0.0093 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:20.1496955Z triton_bmm_3227 0.0093 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:20.1497916Z triton_bmm_3231 0.0093 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.1498877Z triton_bmm_3234 0.0094 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:20.1499841Z triton_bmm_3232 0.0094 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:20.1500801Z triton_bmm_3233 0.0094 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.1501848Z triton_bmm_3235 0.0095 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:20.1502662Z SingleProcess AUTOTUNE benchmarking takes 0.2223 seconds and 0.0003 seconds precompiling for 15 choices 2025-09-07T09:48:20.3666675Z Autotune Choices Stats: 2025-09-07T09:48:20.3667904Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01196799986064434, "best_triton_pos": 1, "best_triton_time": 0.012927999719977379, "best_triton_kernel": "triton_mm_3249", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:48:20.3773597Z AUTOTUNE mm(1568x1536, 1536x512) 2025-09-07T09:48:20.3773860Z strides: [1536, 1], [512, 1] 2025-09-07T09:48:20.3774108Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:20.3774370Z mm 0.0120 ms 100.0% 2025-09-07T09:48:20.3775269Z triton_mm_3249 0.0129 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:20.3776269Z triton_mm_3255 0.0150 ms 79.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:20.3777253Z triton_mm_3245 0.0163 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:20.3778536Z triton_mm_3244 0.0164 ms 73.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:20.3779486Z triton_mm_3248 0.0169 ms 71.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.3780661Z triton_mm_3254 0.0182 ms 65.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.3781763Z triton_mm_3247 0.0184 ms 64.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:20.3782748Z triton_mm_3251 0.0188 ms 63.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:20.3783581Z triton_mm_3241 0.0225 ms 53.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:20.3784305Z SingleProcess AUTOTUNE benchmarking takes 0.2276 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:20.8112481Z Autotune Choices Stats: 2025-09-07T09:48:20.8113567Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_6865", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.010463999584317207, "best_triton_pos": 0} 2025-09-07T09:48:20.8380043Z AUTOTUNE mm(1568x512, 512x1024) 2025-09-07T09:48:20.8380474Z strides: [512, 1], [1024, 1] 2025-09-07T09:48:20.8380902Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:20.8382104Z triton_mm_6865 0.0105 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:20.8383040Z mm 0.0105 ms 99.7% 2025-09-07T09:48:20.8383622Z triton_mm_6864 0.0108 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.8384659Z triton_mm_6858 0.0109 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.8385776Z triton_mm_6857 0.0115 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:20.8386745Z triton_mm_6854 0.0116 ms 90.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:20.8387714Z triton_mm_6861 0.0116 ms 89.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:20.8388677Z triton_mm_6856 0.0123 ms 84.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.8389639Z triton_mm_6863 0.0126 ms 83.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.8390611Z triton_mm_6860 0.0128 ms 81.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:20.8391455Z SingleProcess AUTOTUNE benchmarking takes 0.2228 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:21.0527324Z Autotune Choices Stats: 2025-09-07T09:48:21.0528566Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013728000223636627, "best_triton_pos": 1, "best_triton_time": 0.015231999568641186, "best_triton_kernel": "triton_mm_6922", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T09:48:21.0600698Z AUTOTUNE mm(6272x1024, 1024x256) 2025-09-07T09:48:21.0601216Z strides: [1024, 1], [256, 1] 2025-09-07T09:48:21.0601481Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:21.0601733Z mm 0.0137 ms 100.0% 2025-09-07T09:48:21.0602408Z triton_mm_6922 0.0152 ms 90.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:21.0603599Z 
triton_mm_6915 0.0159 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.0604565Z triton_mm_6911 0.0169 ms 81.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:21.0605838Z triton_mm_6921 0.0172 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.0606813Z triton_mm_6914 0.0181 ms 75.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:21.0607769Z triton_mm_6916 0.0182 ms 75.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:21.0608744Z triton_mm_6918 0.0191 ms 71.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:21.0609693Z triton_mm_6912 0.0197 ms 69.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:21.0610650Z triton_mm_6913 0.0209 ms 65.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.0611494Z SingleProcess AUTOTUNE benchmarking takes 0.2202 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:21.2500695Z Autotune Choices Stats: 2025-09-07T09:48:21.2501952Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009631999768316746, "best_triton_pos": 1, 
"best_triton_time": 0.00979200005531311, "best_triton_kernel": "triton_mm_6949", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T09:48:21.2929699Z AUTOTUNE mm(6272x256, 256x256) 2025-09-07T09:48:21.2930121Z strides: [256, 1], [256, 1] 2025-09-07T09:48:21.2930549Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:21.2930972Z mm 0.0096 ms 100.0% 2025-09-07T09:48:21.2931900Z triton_mm_6949 0.0098 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:21.2933443Z triton_mm_6953 0.0098 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.2934426Z triton_mm_6960 0.0100 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:21.2936065Z triton_mm_6956 0.0101 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:21.2937043Z triton_mm_6959 0.0101 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.2938199Z triton_mm_6952 0.0102 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:21.2939167Z triton_mm_6951 0.0104 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.2940151Z 
triton_mm_6955 0.0106 ms 91.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.2941120Z triton_mm_6958 0.0107 ms 89.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.2942059Z SingleProcess AUTOTUNE benchmarking takes 0.2310 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:21.4021171Z Autotune Choices Stats: 2025-09-07T09:48:21.4022219Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_6991", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.01065600011497736, "best_triton_pos": 0} 2025-09-07T09:48:21.4628557Z AUTOTUNE bmm(1024x49x49, 1024x49x32) 2025-09-07T09:48:21.4629043Z strides: [2432, 1, 49], [1568, 32, 1] 2025-09-07T09:48:21.4629501Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:21.4630606Z triton_bmm_6991 0.0107 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:21.4632230Z triton_bmm_6984 0.0108 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:21.4634035Z triton_bmm_6990 0.0109 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.4635355Z triton_bmm_6986 0.0113 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:21.4636342Z 
triton_bmm_6981 0.0116 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:21.4637324Z triton_bmm_6987 0.0116 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:21.4638299Z triton_bmm_6993 0.0116 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:21.4639274Z triton_bmm_6985 0.0116 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:21.4640251Z triton_bmm_6988 0.0117 ms 90.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.4641229Z triton_bmm_6992 0.0121 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:21.4642321Z SingleProcess AUTOTUNE benchmarking takes 0.1676 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:48:21.5719930Z Autotune Choices Stats: 2025-09-07T09:48:21.5721052Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_7002", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.012256000190973282, "best_triton_pos": 0} 2025-09-07T09:48:21.6080510Z AUTOTUNE bmm(1024x49x32, 1024x32x49) 2025-09-07T09:48:21.6080782Z strides: [1568, 32, 1], [1600, 1, 32] 2025-09-07T09:48:21.6081032Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:21.6081625Z 
triton_bmm_7002 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:21.6082619Z triton_bmm_7000 0.0123 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:21.6083706Z triton_bmm_7003 0.0123 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.6084658Z triton_bmm_7005 0.0124 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:21.6085948Z triton_bmm_6998 0.0129 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:21.6086909Z triton_bmm_7006 0.0131 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:21.6087875Z triton_bmm_7001 0.0132 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:21.6088856Z triton_bmm_7007 0.0132 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:21.6089816Z triton_bmm_7004 0.0132 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:21.6090782Z triton_bmm_6996 0.0132 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:21.6091620Z SingleProcess AUTOTUNE benchmarking takes 0.1446 seconds and 0.0003 seconds precompiling for 15 choices 2025-09-07T09:48:21.7245276Z Autotune Choices Stats: 2025-09-07T09:48:21.7246279Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_7019", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.011680000461637974, "best_triton_pos": 0} 2025-09-07T09:48:21.8148726Z AUTOTUNE bmm(1024x32x49, 1024x49x49) 2025-09-07T09:48:21.8149012Z strides: [1600, 1, 32], [2401, 49, 1] 2025-09-07T09:48:21.8149292Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:21.8149940Z triton_bmm_7019 0.0117 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:21.8150935Z triton_bmm_7020 0.0118 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:22.0129099Z triton_bmm_7016 0.0120 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:22.0130706Z triton_bmm_7022 0.0121 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:22.0162210Z triton_bmm_7015 0.0124 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:22.0163752Z triton_bmm_7012 0.0133 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:22.0165580Z triton_bmm_7009 0.0134 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:22.0167258Z triton_bmm_7013 0.0136 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:22.0168836Z triton_bmm_7011 0.0136 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:22.0170392Z triton_bmm_7017 0.0139 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.0171747Z SingleProcess AUTOTUNE benchmarking takes 0.2063 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T09:48:22.0172544Z Autotune Choices Stats: 2025-09-07T09:48:22.0173561Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_7034", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.01152000017464161, "best_triton_pos": 0} 2025-09-07T09:48:22.0188698Z AUTOTUNE bmm(1024x49x49, 1024x49x32) 2025-09-07T09:48:22.0188944Z strides: [2401, 49, 1], [1600, 1, 49] 2025-09-07T09:48:22.0189192Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:22.0189787Z triton_bmm_7034 0.0115 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:22.0190721Z triton_bmm_7027 0.0116 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:22.0191636Z triton_bmm_7033 0.0116 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.0192553Z triton_bmm_7036 0.0118 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:22.0193482Z triton_bmm_7030 0.0119 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:22.0194393Z triton_bmm_7031 0.0123 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.0195434Z triton_bmm_7029 0.0124 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:22.0196340Z triton_bmm_7032 0.0126 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.0197412Z triton_bmm_7024 0.0126 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:22.0198315Z triton_bmm_7026 0.0128 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:22.0199112Z SingleProcess AUTOTUNE benchmarking takes 0.2036 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:48:22.2246039Z Autotune Choices Stats: 2025-09-07T09:48:22.2247288Z {"num_choices": 20, 
"num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01228800043463707, "best_triton_pos": 1, "best_triton_time": 0.013248000293970108, "best_triton_kernel": "triton_mm_7055", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T09:48:22.2577352Z AUTOTUNE mm(6272x768, 768x256) 2025-09-07T09:48:22.2577626Z strides: [768, 1], [256, 1] 2025-09-07T09:48:22.2577888Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:22.2578165Z mm 0.0123 ms 100.0% 2025-09-07T09:48:22.2578782Z triton_mm_7055 0.0132 ms 92.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:22.2579810Z triton_mm_7048 0.0139 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.2580782Z triton_mm_7044 0.0144 ms 85.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:22.2581850Z triton_mm_7054 0.0145 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.2582819Z triton_mm_7047 0.0152 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.2583642Z triton_mm_7051 0.0159 ms 77.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.2584490Z triton_mm_7049 0.0163 ms 75.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:22.2585482Z triton_mm_7046 0.0172 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.2586309Z triton_mm_7045 0.0176 ms 69.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:22.2587042Z SingleProcess AUTOTUNE benchmarking takes 0.2382 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:22.4706072Z Autotune Choices Stats: 2025-09-07T09:48:22.4707050Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_7314", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.010879999957978725, "best_triton_pos": 0} 2025-09-07T09:48:22.4853194Z AUTOTUNE mm(6272x256, 256x512) 2025-09-07T09:48:22.4853458Z strides: [256, 1], [512, 1] 2025-09-07T09:48:22.4853716Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:22.4854355Z triton_mm_7314 0.0109 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.4855580Z mm 0.0112 ms 96.9% 2025-09-07T09:48:22.4856157Z triton_mm_7320 0.0113 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.4857142Z triton_mm_7313 0.0114 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.4858277Z triton_mm_7312 0.0115 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.4859282Z triton_mm_7319 0.0116 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.4860251Z triton_mm_7317 0.0117 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.4861230Z triton_mm_7316 0.0117 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.4862272Z triton_mm_7309 0.0129 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:22.4863242Z triton_mm_7321 0.0135 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:22.4863972Z SingleProcess AUTOTUNE benchmarking takes 0.2098 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:22.7129577Z Autotune Choices Stats: 2025-09-07T09:48:22.7130572Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_7371", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.019360000267624855, "best_triton_pos": 0} 2025-09-07T09:48:22.7209377Z AUTOTUNE mm(25088x512, 512x128) 2025-09-07T09:48:22.7209648Z strides: [512, 1], [128, 1] 2025-09-07T09:48:22.7209907Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:22.7210573Z triton_mm_7371 0.0194 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.7211563Z triton_mm_7377 0.0203 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.7212191Z mm 0.0203 ms 95.3% 2025-09-07T09:48:22.7212792Z triton_mm_7369 0.0221 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.7213850Z triton_mm_7378 0.0224 ms 86.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:22.7214750Z triton_mm_7370 0.0227 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.7215887Z triton_mm_7376 0.0228 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.7216739Z triton_mm_7367 0.0231 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:22.7217573Z triton_mm_7372 0.0231 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:22.7218673Z triton_mm_7374 0.0240 ms 80.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.7219408Z SingleProcess AUTOTUNE benchmarking takes 0.2252 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:22.9214168Z Autotune Choices Stats: 
2025-09-07T09:48:22.9215794Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_7409", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.010432000271975994, "best_triton_pos": 0} 2025-09-07T09:48:22.9554034Z AUTOTUNE mm(25088x128, 128x128) 2025-09-07T09:48:22.9554265Z strides: [128, 1], [128, 1] 2025-09-07T09:48:22.9554493Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:22.9555273Z triton_mm_7409 0.0104 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.9556201Z triton_mm_7407 0.0105 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.9557118Z triton_mm_7411 0.0110 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.9558028Z triton_mm_7408 0.0112 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.9558944Z triton_mm_7414 0.0112 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.9559852Z triton_mm_7404 0.0113 ms 92.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:22.9560426Z mm 0.0114 ms 91.8% 2025-09-07T09:48:22.9560953Z triton_mm_7412 0.0114 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:22.9561857Z triton_mm_7415 0.0116 ms 90.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:22.9562767Z triton_mm_7410 0.0118 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:22.9563584Z SingleProcess AUTOTUNE benchmarking takes 0.2249 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:23.0772969Z Autotune Choices Stats: 2025-09-07T09:48:23.0773939Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_7440", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.015968000516295433, "best_triton_pos": 0} 2025-09-07T09:48:23.1904701Z AUTOTUNE bmm(2048x49x49, 2048x49x32) 2025-09-07T09:48:23.1905326Z strides: [2432, 1, 49], [1568, 32, 1] 2025-09-07T09:48:23.1905631Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:23.1906257Z triton_bmm_7440 0.0160 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:23.1907210Z triton_bmm_7447 0.0160 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:23.1908452Z triton_bmm_7446 0.0160 ms 99.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:23.1909392Z triton_bmm_7443 0.0168 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:23.1910477Z triton_bmm_7449 0.0169 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:23.1911404Z triton_bmm_7437 0.0170 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:23.1912316Z triton_bmm_7442 0.0172 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:23.1913230Z triton_bmm_7448 0.0178 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:23.1914142Z triton_bmm_7444 0.0179 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:23.1915206Z triton_bmm_7441 0.0180 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:23.1916013Z SingleProcess AUTOTUNE benchmarking takes 0.2268 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:48:23.3085239Z Autotune Choices Stats: 2025-09-07T09:48:23.3086479Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_7458", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.017983999103307724, "best_triton_pos": 0} 2025-09-07T09:48:23.4253890Z AUTOTUNE bmm(2048x49x32, 2048x32x49) 2025-09-07T09:48:23.4254193Z strides: [1568, 32, 1], [1600, 1, 32] 2025-09-07T09:48:23.4254498Z 
dtypes: torch.float16, torch.float16 2025-09-07T09:48:23.4255488Z triton_bmm_7458 0.0180 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:23.4256509Z triton_bmm_7461 0.0182 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:23.4257489Z triton_bmm_7456 0.0183 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:23.4258458Z triton_bmm_7459 0.0183 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:23.4259426Z triton_bmm_7460 0.0188 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:23.4260435Z triton_bmm_7457 0.0191 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:23.4261538Z triton_bmm_7463 0.0192 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:23.4262524Z triton_bmm_7454 0.0195 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:23.4263851Z triton_bmm_7462 0.0196 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:48:23.4264698Z triton_bmm_7452 0.0203 ms 88.5% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:23.4265794Z SingleProcess AUTOTUNE benchmarking takes 0.2342 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:48:23.5512192Z Autotune Choices Stats: 2025-09-07T09:48:23.5513202Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_7472", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.016863999888300896, "best_triton_pos": 0} 2025-09-07T09:48:23.6557151Z AUTOTUNE bmm(2048x32x49, 2048x49x49) 2025-09-07T09:48:23.6557433Z strides: [1600, 1, 32], [2401, 49, 1] 2025-09-07T09:48:23.6557702Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:23.6558358Z triton_bmm_7472 0.0169 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:23.6559389Z triton_bmm_7475 0.0170 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:23.6560381Z triton_bmm_7476 0.0170 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:23.6561362Z triton_bmm_7471 0.0177 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:23.6562347Z triton_bmm_7478 0.0177 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:23.6563318Z triton_bmm_7465 0.0199 ms 84.7% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:23.6564274Z triton_bmm_7468 0.0202 ms 83.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:23.6565401Z triton_bmm_7467 0.0209 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:23.6566349Z triton_bmm_7469 0.0212 ms 79.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:23.6567277Z triton_bmm_7473 0.0212 ms 79.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:23.6568073Z SingleProcess AUTOTUNE benchmarking takes 0.2295 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T09:48:23.8058183Z Autotune Choices Stats: 2025-09-07T09:48:23.8059191Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_7483", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.016607999801635742, "best_triton_pos": 0} 2025-09-07T09:48:23.8874615Z AUTOTUNE bmm(2048x56x56, 2048x56x32) 2025-09-07T09:48:23.8875371Z strides: [3136, 56, 1], [1792, 1, 56] 2025-09-07T09:48:23.8875648Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:23.8876549Z triton_bmm_7483 0.0166 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:23.8877546Z triton_bmm_7489 0.0167 ms 99.4% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:23.8878698Z triton_bmm_7490 0.0167 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:48:23.8879738Z triton_bmm_7487 0.0172 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:23.8880729Z triton_bmm_7485 0.0176 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:48:23.8881714Z triton_bmm_7486 0.0178 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:23.8882710Z triton_bmm_7492 0.0178 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:23.8883713Z triton_bmm_7488 0.0179 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:23.8884648Z triton_bmm_7481 0.0180 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:48:23.8885698Z triton_bmm_7482 0.0180 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:23.8886515Z SingleProcess AUTOTUNE benchmarking takes 0.2312 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:48:24.0970197Z 
Autotune Choices Stats: 2025-09-07T09:48:24.0971207Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_7504", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.01603199914097786, "best_triton_pos": 0} 2025-09-07T09:48:24.1226712Z AUTOTUNE mm(25088x384, 384x128) 2025-09-07T09:48:24.1226964Z strides: [384, 1], [128, 1] 2025-09-07T09:48:24.1227224Z dtypes: torch.float16, torch.float16 2025-09-07T09:48:24.1227887Z triton_mm_7504 0.0160 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:24.1228890Z triton_mm_7510 0.0170 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:24.1229505Z mm 0.0170 ms 94.2% 2025-09-07T09:48:24.1230084Z triton_mm_7502 0.0177 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:24.1231044Z triton_mm_7503 0.0179 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:24.1232016Z triton_mm_7509 0.0185 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:24.1232991Z triton_mm_7511 0.0191 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:48:24.1234375Z triton_mm_7507 0.0193 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:48:24.1235684Z triton_mm_7506 0.0194 ms 82.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:48:24.1236876Z triton_mm_7500 0.0198 ms 81.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:48:24.1237728Z SingleProcess AUTOTUNE benchmarking takes 0.2345 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:48:39.8860349Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T09:48:39.8861551Z pred = mod(*cloned_inputs) 2025-09-07T09:48:39.8862123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 838, in forward 2025-09-07T09:48:39.8862696Z x = self.forward_features(x) 2025-09-07T09:48:39.8863256Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 830, in forward_features 2025-09-07T09:48:39.8863809Z x = self.layers(x) 2025-09-07T09:48:39.8864297Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 559, in forward 2025-09-07T09:48:39.8864813Z x = self.blocks(x) 2025-09-07T09:48:39.8865687Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 406, in forward 2025-09-07T09:48:39.8866258Z x = x + self.drop_path1(self._attn(self.norm1(x))) 2025-09-07T09:48:39.8866817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 390, in _attn 2025-09-07T09:48:39.8867533Z attn_windows = self.attn(x_windows, mask=attn_mask) # nW*B, 
window_size*window_size, C 2025-09-07T09:48:39.8868369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 194, in forward 2025-09-07T09:48:39.8868899Z attn = attn + self._get_rel_pos_bias() 2025-09-07T09:48:39.8869456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 165, in _get_rel_pos_bias 2025-09-07T09:48:39.8870096Z relative_position_bias = self.relative_position_bias_table[ 2025-09-07T09:48:39.8870355Z 2025-09-07T09:48:39.8870359Z 2025-09-07T09:48:43.1756702Z W0907 09:48:43.174000 65018 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:49:33.0463974Z pass 2025-09-07T09:49:41.7690766Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:49:41.7692475Z import pynvml # type: ignore[import] 2025-09-07T09:49:44.9723599Z 2025-09-07T09:49:48.0217645Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:49:48.0218010Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:49:48.0218311Z cuda train swsl_resnext101_32x16d 2025-09-07T09:50:18.9559064Z Autotune Choices Stats: 2025-09-07T09:50:18.9560882Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_5", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8", "best_time": 0.05641600117087364, "best_triton_pos": 0} 2025-09-07T09:50:19.0494047Z AUTOTUNE convolution(8x3x224x224, 64x3x7x7) 2025-09-07T09:50:19.0494664Z strides: [150528, 50176, 224, 1], [147, 49, 7, 1] 2025-09-07T09:50:19.0495890Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:19.0497199Z triton_convolution2d_5 0.0564 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:50:19.0512126Z triton_convolution2d_1 0.0614 ms 91.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:50:19.0513394Z triton_convolution2d_3 0.0630 ms 89.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:50:19.0514153Z convolution 0.0639 ms 88.3% 2025-09-07T09:50:19.0514880Z triton_convolution2d_0 0.0726 ms 77.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 
2025-09-07T09:50:19.0516303Z triton_convolution2d_4 0.0868 ms 65.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:50:19.0517447Z triton_convolution2d_2 0.1816 ms 31.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:50:19.0518348Z SingleProcess AUTOTUNE benchmarking takes 0.2451 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T09:50:19.1831827Z Autotune Choices Stats: 2025-09-07T09:50:19.1833191Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01600000075995922, "best_triton_pos": 1, "best_triton_time": 0.01942400075495243, "best_triton_kernel": "triton_convolution2d_10", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:50:19.2619220Z AUTOTUNE convolution(8x64x56x56, 512x64x1x1) 2025-09-07T09:50:19.2619767Z strides: [200704, 3136, 56, 1], [64, 1, 1, 1] 2025-09-07T09:50:19.2620249Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:19.2620700Z convolution 0.0160 ms 100.0% 2025-09-07T09:50:19.2622018Z triton_convolution2d_10 0.0194 ms 82.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.2624015Z triton_convolution2d_9 0.0205 ms 78.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.2626426Z triton_convolution2d_7 0.0212 ms 75.6% ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.2627632Z triton_convolution2d_6 0.0212 ms 75.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.2628841Z triton_convolution2d_12 0.0222 ms 71.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.2630051Z triton_convolution2d_11 0.0224 ms 71.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.2631513Z triton_convolution2d_8 0.0311 ms 51.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:19.2632263Z conv1x1_via_mm 0.1360 ms 11.8% 2025-09-07T09:50:19.2632728Z SingleProcess AUTOTUNE benchmarking takes 0.2117 seconds and 0.0003 seconds precompiling for 9 choices 2025-09-07T09:50:19.4087400Z Autotune Choices Stats: 2025-09-07T09:50:19.4089921Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.024831999093294144, "best_triton_pos": 1, "best_triton_time": 0.038816001266241074, "best_triton_kernel": "triton_convolution2d_13", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:50:19.4912587Z AUTOTUNE convolution(8x512x56x56, 256x512x1x1) 2025-09-07T09:50:19.4912905Z strides: [1605632, 3136, 56, 1], [512, 1, 1, 1] 
2025-09-07T09:50:19.4913199Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:19.4913462Z convolution 0.0248 ms 100.0% 2025-09-07T09:50:19.4914207Z triton_convolution2d_13 0.0388 ms 64.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.4915725Z triton_convolution2d_17 0.0423 ms 58.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.4916928Z triton_convolution2d_18 0.0434 ms 57.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.4918057Z triton_convolution2d_14 0.0441 ms 56.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.4919187Z triton_convolution2d_16 0.0449 ms 55.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.4920320Z triton_convolution2d_19 0.0496 ms 50.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.4921435Z triton_convolution2d_15 0.0876 ms 28.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:19.4922130Z conv1x1_via_mm 0.1731 ms 14.3% 2025-09-07T09:50:19.4922562Z SingleProcess AUTOTUNE benchmarking takes 0.2280 seconds 
and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:19.6216601Z Autotune Choices Stats: 2025-09-07T09:50:19.6218790Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.011872000060975552, "best_triton_pos": 1, "best_triton_time": 0.013183999806642532, "best_triton_kernel": "triton_convolution2d_20", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:50:19.7225638Z AUTOTUNE convolution(8x64x56x56, 256x64x1x1) 2025-09-07T09:50:19.7226167Z strides: [200704, 3136, 56, 1], [64, 1, 1, 1] 2025-09-07T09:50:19.7226711Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:19.7227354Z convolution 0.0119 ms 100.0% 2025-09-07T09:50:19.7228505Z triton_convolution2d_20 0.0132 ms 90.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.7229934Z triton_convolution2d_24 0.0137 ms 86.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.7231285Z triton_convolution2d_23 0.0144 ms 82.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.7232502Z triton_convolution2d_21 0.0146 ms 81.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.7233709Z triton_convolution2d_25 0.0156 ms 75.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, 
PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.7234922Z triton_convolution2d_26 0.0157 ms 75.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.7236285Z triton_convolution2d_22 0.0205 ms 58.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:19.7237021Z conv1x1_via_mm 0.0805 ms 14.7% 2025-09-07T09:50:19.7237455Z SingleProcess AUTOTUNE benchmarking takes 0.2298 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:19.8680037Z Autotune Choices Stats: 2025-09-07T09:50:19.8682184Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.022463999688625336, "best_triton_pos": 1, "best_triton_time": 0.03750399872660637, "best_triton_kernel": "triton_convolution2d_30", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T09:50:19.9140412Z AUTOTUNE convolution(8x256x56x56, 512x256x1x1) 2025-09-07T09:50:19.9140910Z strides: [802816, 3136, 56, 1], [256, 1, 1, 1] 2025-09-07T09:50:19.9141492Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:19.9141957Z convolution 0.0225 ms 100.0% 2025-09-07T09:50:19.9143159Z triton_convolution2d_30 0.0375 ms 59.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.9145682Z triton_convolution2d_31 0.0379 ms 59.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, 
STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.9147813Z triton_convolution2d_33 0.0410 ms 54.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.9149036Z triton_convolution2d_28 0.0424 ms 52.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.9150259Z triton_convolution2d_32 0.0507 ms 44.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:19.9151475Z triton_convolution2d_27 0.0551 ms 40.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:19.9152913Z triton_convolution2d_29 0.0826 ms 27.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:19.9153668Z conv1x1_via_mm 0.1763 ms 12.7% 2025-09-07T09:50:19.9154138Z SingleProcess AUTOTUNE benchmarking takes 0.1898 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:20.0959149Z Autotune Choices Stats: 2025-09-07T09:50:20.0961715Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03532800078392029, "best_triton_pos": 1, "best_triton_time": 0.0663359984755516, "best_triton_kernel": "triton_convolution2d_58", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 
2025-09-07T09:50:20.1041809Z AUTOTUNE convolution(8x256x56x56, 1024x256x1x1) 2025-09-07T09:50:20.1042376Z strides: [802816, 3136, 56, 1], [256, 1, 1, 1] 2025-09-07T09:50:20.1042848Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:20.1043286Z convolution 0.0353 ms 100.0% 2025-09-07T09:50:20.1044476Z triton_convolution2d_58 0.0663 ms 53.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.1046827Z triton_convolution2d_59 0.0674 ms 52.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.1047969Z triton_convolution2d_61 0.0719 ms 49.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.1049117Z triton_convolution2d_56 0.0744 ms 47.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.1050240Z triton_convolution2d_55 0.0884 ms 39.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.1051377Z triton_convolution2d_60 0.0885 ms 39.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.1052505Z triton_convolution2d_57 0.1438 ms 24.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, 
num_warps=8 2025-09-07T09:50:20.1053202Z conv1x1_via_mm 0.2997 ms 11.8% 2025-09-07T09:50:20.1053643Z SingleProcess AUTOTUNE benchmarking takes 0.1811 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:20.3166262Z Autotune Choices Stats: 2025-09-07T09:50:20.3168421Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01817600056529045, "best_triton_pos": 1, "best_triton_time": 0.03561599925160408, "best_triton_kernel": "triton_convolution2d_66", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:50:20.3339374Z AUTOTUNE convolution(8x1024x28x28, 512x1024x1x1) 2025-09-07T09:50:20.3339925Z strides: [802816, 784, 28, 1], [1024, 1, 1, 1] 2025-09-07T09:50:20.3340393Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:20.3340832Z convolution 0.0182 ms 100.0% 2025-09-07T09:50:20.3342124Z triton_convolution2d_66 0.0356 ms 51.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.3344481Z triton_convolution2d_65 0.0395 ms 46.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.3347079Z triton_convolution2d_67 0.0420 ms 43.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.3349006Z triton_convolution2d_68 0.0420 ms 43.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T09:50:20.3350224Z triton_convolution2d_62 0.0592 ms 30.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.3351451Z triton_convolution2d_63 0.0633 ms 28.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.3352674Z triton_convolution2d_64 0.0843 ms 21.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:20.3353419Z conv1x1_via_mm 0.1072 ms 17.0% 2025-09-07T09:50:20.3353885Z SingleProcess AUTOTUNE benchmarking takes 0.2285 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:20.4472242Z Autotune Choices Stats: 2025-09-07T09:50:20.4473330Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_74", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8", "best_time": 0.019328000023961067, "best_triton_pos": 0} 2025-09-07T09:50:20.4606546Z AUTOTUNE convolution(8x256x56x56, 512x256x1x1) 2025-09-07T09:50:20.4607069Z strides: [802816, 3136, 56, 1], [256, 1, 1, 1] 2025-09-07T09:50:20.4607538Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:20.4608767Z triton_convolution2d_74 0.0193 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.4610735Z triton_convolution2d_75 0.0249 ms 77.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, 
PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.4612662Z triton_convolution2d_69 0.0251 ms 77.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.4614583Z triton_convolution2d_73 0.0254 ms 76.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.4616889Z triton_convolution2d_70 0.0263 ms 73.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.4617591Z convolution 0.0356 ms 54.2% 2025-09-07T09:50:20.4618224Z triton_convolution2d_72 0.0560 ms 34.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.4619470Z triton_convolution2d_71 0.0886 ms 21.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:20.4620304Z SingleProcess AUTOTUNE benchmarking takes 0.1253 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:50:20.6069863Z Autotune Choices Stats: 2025-09-07T09:50:20.6071407Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.020479999482631683, "best_triton_pos": 1, "best_triton_time": 0.03577600046992302, "best_triton_kernel": "triton_convolution2d_80", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, 
num_warps=4"} 2025-09-07T09:50:20.6953592Z AUTOTUNE convolution(8x512x28x28, 1024x512x1x1) 2025-09-07T09:50:20.6953912Z strides: [401408, 784, 28, 1], [512, 1, 1, 1] 2025-09-07T09:50:20.6954207Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:20.6954475Z convolution 0.0205 ms 100.0% 2025-09-07T09:50:20.6955484Z triton_convolution2d_80 0.0358 ms 57.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.6957340Z triton_convolution2d_76 0.0360 ms 56.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.6959302Z triton_convolution2d_79 0.0404 ms 50.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.6961286Z triton_convolution2d_81 0.0423 ms 48.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.6963256Z triton_convolution2d_82 0.0429 ms 47.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.6965466Z triton_convolution2d_77 0.0570 ms 36.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.6967174Z triton_convolution2d_78 0.0856 ms 23.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, 
num_stages=1, num_warps=8 2025-09-07T09:50:20.6967860Z conv1x1_via_mm 0.1082 ms 18.9% 2025-09-07T09:50:20.6968290Z SingleProcess AUTOTUNE benchmarking takes 0.2342 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:20.8786306Z Autotune Choices Stats: 2025-09-07T09:50:20.8788842Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.028255999088287354, "best_triton_pos": 1, "best_triton_time": 0.05724800005555153, "best_triton_kernel": "triton_convolution2d_122", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:50:20.9042284Z AUTOTUNE convolution(8x512x28x28, 2048x512x1x1) 2025-09-07T09:50:20.9042825Z strides: [401408, 784, 28, 1], [512, 1, 1, 1] 2025-09-07T09:50:20.9043291Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:20.9043764Z convolution 0.0283 ms 100.0% 2025-09-07T09:50:20.9045558Z triton_convolution2d_122 0.0572 ms 49.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.9047760Z triton_convolution2d_121 0.0588 ms 48.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.9048902Z triton_convolution2d_123 0.0614 ms 46.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.9050223Z triton_convolution2d_118 0.0630 ms 44.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 
2025-09-07T09:50:20.9051361Z triton_convolution2d_124 0.0718 ms 39.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:20.9052489Z triton_convolution2d_119 0.0860 ms 32.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:20.9053623Z triton_convolution2d_120 0.1607 ms 17.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:20.9054316Z conv1x1_via_mm 0.1844 ms 15.3% 2025-09-07T09:50:20.9054755Z SingleProcess AUTOTUNE benchmarking takes 0.1949 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:21.0842956Z Autotune Choices Stats: 2025-09-07T09:50:21.0845596Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04179200157523155, "best_triton_pos": 1, "best_triton_time": 0.058240000158548355, "best_triton_kernel": "triton_convolution2d_129", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:50:21.1099623Z AUTOTUNE convolution(8x2048x14x14, 1024x2048x1x1) 2025-09-07T09:50:21.1100223Z strides: [401408, 196, 14, 1], [2048, 1, 1, 1] 2025-09-07T09:50:21.1100710Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:21.1101152Z convolution 0.0418 ms 100.0% 2025-09-07T09:50:21.1102510Z triton_convolution2d_129 0.0582 ms 71.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 
2025-09-07T09:50:21.1104544Z triton_convolution2d_128 0.0694 ms 60.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.1106894Z triton_convolution2d_130 0.0711 ms 58.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.1108112Z triton_convolution2d_131 0.0729 ms 57.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.1108867Z conv1x1_via_mm 0.0868 ms 48.2% 2025-09-07T09:50:21.1109596Z triton_convolution2d_126 0.0957 ms 43.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.1110809Z triton_convolution2d_125 0.1084 ms 38.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.1112237Z triton_convolution2d_127 0.1552 ms 26.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:21.1113201Z SingleProcess AUTOTUNE benchmarking takes 0.2040 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:21.2152451Z Autotune Choices Stats: 2025-09-07T09:50:21.2153710Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_136", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, 
STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.022624000906944275, "best_triton_pos": 0} 2025-09-07T09:50:21.3199275Z AUTOTUNE convolution(8x512x28x28, 1024x512x1x1) 2025-09-07T09:50:21.3199801Z strides: [401408, 784, 28, 1], [512, 1, 1, 1] 2025-09-07T09:50:21.3200279Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:21.3201508Z triton_convolution2d_136 0.0226 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.3203491Z triton_convolution2d_137 0.0253 ms 89.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.3205907Z triton_convolution2d_135 0.0256 ms 88.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.3207146Z convolution 0.0284 ms 79.8% 2025-09-07T09:50:21.3207803Z triton_convolution2d_138 0.0352 ms 64.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.3208940Z triton_convolution2d_132 0.0365 ms 62.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.3210063Z triton_convolution2d_133 0.0387 ms 58.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.3211190Z triton_convolution2d_134 0.0725 ms 31.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, 
GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:21.3212078Z SingleProcess AUTOTUNE benchmarking takes 0.2085 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:50:21.4712914Z Autotune Choices Stats: 2025-09-07T09:50:21.4713964Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_143", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.03759999945759773, "best_triton_pos": 0} 2025-09-07T09:50:21.5338928Z AUTOTUNE convolution(8x1024x14x14, 2048x1024x1x1) 2025-09-07T09:50:21.5339462Z strides: [200704, 196, 14, 1], [1024, 1, 1, 1] 2025-09-07T09:50:21.5339930Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:21.5341157Z triton_convolution2d_143 0.0376 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.5343268Z triton_convolution2d_142 0.0403 ms 93.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.5345863Z triton_convolution2d_144 0.0410 ms 91.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.5347961Z triton_convolution2d_145 0.0435 ms 86.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.5348715Z convolution 0.0437 ms 86.0% 2025-09-07T09:50:21.5349588Z 
triton_convolution2d_140 0.0539 ms 69.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.5350819Z triton_convolution2d_139 0.0592 ms 63.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.5352044Z triton_convolution2d_141 0.0916 ms 41.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:21.5352793Z conv1x1_via_mm 0.1033 ms 36.4% 2025-09-07T09:50:21.5353261Z SingleProcess AUTOTUNE benchmarking takes 0.2131 seconds and 0.0003 seconds precompiling for 9 choices 2025-09-07T09:50:21.8084916Z Autotune Choices Stats: 2025-09-07T09:50:21.8087387Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.054207999259233475, "best_triton_pos": 1, "best_triton_time": 0.07065600156784058, "best_triton_kernel": "triton_convolution2d_451", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T09:50:21.9637528Z AUTOTUNE convolution(8x1024x14x14, 4096x1024x1x1) 2025-09-07T09:50:21.9637855Z strides: [200704, 196, 14, 1], [1024, 1, 1, 1] 2025-09-07T09:50:21.9638143Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:21.9638400Z convolution 0.0542 ms 100.0% 2025-09-07T09:50:21.9639126Z triton_convolution2d_451 0.0707 ms 76.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.9640361Z triton_convolution2d_450 
0.0733 ms 74.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.9641574Z triton_convolution2d_452 0.0753 ms 72.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.9642783Z triton_convolution2d_453 0.0803 ms 67.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:21.9644006Z triton_convolution2d_448 0.0997 ms 54.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.9645519Z triton_convolution2d_447 0.1068 ms 50.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:21.9646293Z conv1x1_via_mm 0.1720 ms 31.5% 2025-09-07T09:50:21.9647022Z triton_convolution2d_449 0.1789 ms 30.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:21.9648210Z SingleProcess AUTOTUNE benchmarking takes 0.3358 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:22.2208191Z Autotune Choices Stats: 2025-09-07T09:50:22.2210692Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04806400090456009, "best_triton_pos": 2, "best_triton_time": 0.1345279961824417, "best_triton_kernel": "triton_convolution2d_457", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T09:50:22.3992356Z AUTOTUNE convolution(8x4096x7x7, 2048x4096x1x1) 2025-09-07T09:50:22.3992672Z strides: [200704, 49, 7, 1], [4096, 1, 1, 1] 2025-09-07T09:50:22.3992948Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:22.3993209Z convolution 0.0481 ms 100.0% 2025-09-07T09:50:22.3993460Z conv1x1_via_mm 0.1103 ms 43.6% 2025-09-07T09:50:22.3994227Z triton_convolution2d_457 0.1345 ms 35.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.3995758Z triton_convolution2d_458 0.1476 ms 32.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:22.3996977Z triton_convolution2d_459 0.1739 ms 27.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.3998145Z triton_convolution2d_460 0.2046 ms 23.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.3999295Z triton_convolution2d_454 0.2127 ms 22.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:22.4000422Z triton_convolution2d_455 0.2547 ms 18.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 
2025-09-07T09:50:22.4001546Z triton_convolution2d_456 0.2729 ms 17.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=512, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:22.4002437Z SingleProcess AUTOTUNE benchmarking takes 0.4339 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:22.5227943Z Autotune Choices Stats: 2025-09-07T09:50:22.5229829Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.02739199995994568, "best_triton_pos": 1, "best_triton_time": 0.04224000126123428, "best_triton_kernel": "triton_convolution2d_464", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T09:50:22.6311214Z AUTOTUNE convolution(8x1024x14x14, 2048x1024x1x1) 2025-09-07T09:50:22.6311558Z strides: [200704, 196, 14, 1], [1024, 1, 1, 1] 2025-09-07T09:50:22.6311851Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:22.6312123Z convolution 0.0274 ms 100.0% 2025-09-07T09:50:22.6312890Z triton_convolution2d_464 0.0422 ms 64.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.6314125Z triton_convolution2d_465 0.0522 ms 52.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:22.6315848Z triton_convolution2d_466 0.0533 ms 51.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.6317086Z triton_convolution2d_461 0.0636 ms 43.0% 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:22.6318429Z triton_convolution2d_467 0.0697 ms 39.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.6319560Z triton_convolution2d_462 0.0727 ms 37.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:22.6320720Z triton_convolution2d_463 0.0928 ms 29.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:22.6321619Z SingleProcess AUTOTUNE benchmarking takes 0.2302 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:50:22.8266233Z Autotune Choices Stats: 2025-09-07T09:50:22.8267749Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04201599955558777, "best_triton_pos": 1, "best_triton_time": 0.07366400212049484, "best_triton_kernel": "triton_convolution2d_471", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T09:50:22.8592314Z AUTOTUNE convolution(8x2048x7x7, 4096x2048x1x1) 2025-09-07T09:50:22.8592658Z strides: [100352, 49, 7, 1], [2048, 1, 1, 1] 2025-09-07T09:50:22.8592944Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:22.8593232Z convolution 0.0420 ms 100.0% 2025-09-07T09:50:22.8593977Z triton_convolution2d_471 0.0737 ms 57.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, 
KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.8595366Z triton_convolution2d_472 0.0833 ms 50.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:22.8596599Z triton_convolution2d_473 0.0947 ms 44.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.8597823Z triton_convolution2d_468 0.1110 ms 37.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:22.8599045Z triton_convolution2d_474 0.1163 ms 36.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T09:50:22.8600263Z triton_convolution2d_469 0.1411 ms 29.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T09:50:22.8601017Z conv1x1_via_mm 0.1554 ms 27.0% 2025-09-07T09:50:22.8601740Z triton_convolution2d_470 0.1666 ms 25.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=512, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T09:50:22.8602910Z SingleProcess AUTOTUNE benchmarking takes 0.2265 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T09:50:23.1224311Z Autotune Choices Stats: 2025-09-07T09:50:23.1226392Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_500", "best_kernel_desc": "ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.010879999957978725, "best_triton_pos": 0} 2025-09-07T09:50:23.3194452Z AUTOTUNE addmm(8x1000, 8x2048, 2048x1000) 2025-09-07T09:50:23.3195129Z strides: [0, 1], [2048, 1], [1, 2048] 2025-09-07T09:50:23.3195448Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:50:23.3196153Z triton_mm_500 0.0109 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:50:23.3196829Z bias_addmm 0.0111 ms 98.0% 2025-09-07T09:50:23.3197999Z triton_mm_504 0.0115 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:50:23.3199562Z triton_mm_508 0.0140 ms 78.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:50:23.3200529Z addmm 0.0147 ms 74.2% 2025-09-07T09:50:23.3201455Z triton_mm_512 0.0153 ms 71.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:50:23.3202970Z triton_mm_499 0.0172 ms 63.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:50:23.3204475Z triton_mm_498 0.0181 ms 60.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:50:23.3206318Z triton_mm_503 0.0185 ms 58.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T09:50:23.3207803Z triton_mm_497 0.0187 ms 58.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:50:23.3208592Z SingleProcess AUTOTUNE benchmarking takes 0.4500 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:50:40.8860544Z Autotune Choices Stats: 2025-09-07T09:50:40.8861696Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_536", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.00687999976798892, "best_triton_pos": 0} 2025-09-07T09:50:41.2706487Z AUTOTUNE mm(1000x8, 8x2048) 2025-09-07T09:50:41.2706796Z strides: [1, 1000], [2048, 1] 2025-09-07T09:50:41.2707075Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:41.2707775Z triton_mm_536 0.0069 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:50:41.2708774Z triton_mm_535 0.0069 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:50:41.2709755Z triton_mm_537 0.0070 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:50:41.2710732Z triton_mm_538 0.0070 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:50:41.5009123Z triton_mm_539 0.0070 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:50:41.5010091Z 
triton_mm_540 0.0070 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:50:41.5011066Z triton_mm_542 0.0070 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:50:41.5012324Z triton_mm_541 0.0071 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:50:41.5013544Z triton_mm_543 0.0071 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:50:41.5014732Z triton_mm_544 0.0072 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:50:41.5016039Z SingleProcess AUTOTUNE benchmarking takes 0.5437 seconds and 0.0003 seconds precompiling for 17 choices 2025-09-07T09:50:42.1972741Z Autotune Choices Stats: 2025-09-07T09:50:42.1974023Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 0.009600000455975533, "best_triton_pos": 1, "best_triton_time": 0.010015999898314476, "best_triton_kernel": "triton_mm_521", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:50:42.3652640Z AUTOTUNE mm(8x1000, 1000x2048) 2025-09-07T09:50:42.3652926Z strides: [1000, 1], [2048, 1] 2025-09-07T09:50:42.3653196Z dtypes: torch.float16, torch.float16 2025-09-07T09:50:42.3653475Z mm 0.0096 ms 100.0% 2025-09-07T09:50:42.3654114Z triton_mm_521 0.0100 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:50:42.3655443Z triton_mm_525 0.0105 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:50:42.3656470Z triton_mm_517 0.0105 ms 91.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:50:42.3657438Z triton_mm_529 0.0117 ms 81.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:50:42.3658396Z triton_mm_515 0.0119 ms 80.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:50:42.3659352Z triton_mm_516 0.0123 ms 78.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:50:42.3660295Z triton_mm_520 0.0123 ms 78.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:50:42.3661251Z triton_mm_527 0.0132 ms 72.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:50:42.3662334Z triton_mm_524 0.0134 ms 71.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:50:42.3663123Z SingleProcess AUTOTUNE benchmarking takes 0.3518 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:50:52.0433326Z pass 2025-09-07T09:50:57.2717484Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:50:57.2718879Z import pynvml # type: ignore[import] 2025-09-07T09:51:00.4298657Z 2025-09-07T09:51:03.1983367Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:51:03.1983730Z loading model: 0it [00:02, ?it/s] 2025-09-07T09:51:03.1984567Z cuda train tf_efficientnet_b0 2025-09-07T09:51:32.1269083Z Autotune Choices Stats: 2025-09-07T09:51:32.1270133Z {"num_choices": 15, "num_triton_choices": 13, "best_kernel": "triton_mm_716", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.008287999778985977, "best_triton_pos": 0} 2025-09-07T09:51:32.1366494Z AUTOTUNE addmm(8x48, 8x1152, 1152x48) 2025-09-07T09:51:32.1366818Z strides: [0, 1], [1152, 1], [1, 1152] 2025-09-07T09:51:32.1367089Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:32.1367713Z triton_mm_716 0.0083 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:32.1368600Z triton_mm_720 0.0089 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:32.1369445Z triton_mm_723 0.0091 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:32.1369979Z bias_addmm 0.0091 ms 91.2% 2025-09-07T09:51:32.1370507Z triton_mm_724 0.0096 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:32.1371057Z addmm 0.0116 ms 71.7% 2025-09-07T09:51:32.1371546Z triton_mm_719 0.0118 ms 70.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:32.1372369Z triton_mm_715 0.0119 ms 69.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:32.1373189Z triton_mm_714 0.0120 ms 68.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:32.1374012Z triton_mm_713 0.0124 ms 66.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:51:32.1374744Z SingleProcess AUTOTUNE benchmarking takes 0.2118 seconds and 0.0003 seconds precompiling for 15 choices 2025-09-07T09:51:32.6062386Z Autotune Choices Stats: 2025-09-07T09:51:32.6063477Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_536", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.008031999692320824, "best_triton_pos": 0} 2025-09-07T09:51:32.8246457Z AUTOTUNE addmm(8x28, 8x672, 672x28) 2025-09-07T09:51:32.8246835Z strides: [0, 1], [672, 1], [1, 672] 2025-09-07T09:51:32.8247150Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:32.8247870Z triton_mm_536 0.0080 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:32.8248867Z triton_mm_535 0.0083 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:51:32.8250439Z triton_mm_529 0.0084 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:32.8251437Z triton_mm_528 0.0093 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:32.8252691Z triton_mm_532 0.0096 ms 83.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:51:32.8253666Z triton_mm_534 0.0102 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:51:32.8254672Z triton_mm_527 0.0107 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:51:32.8255732Z bias_addmm 0.0118 ms 68.2% 2025-09-07T09:51:32.8256353Z triton_mm_533 0.0121 ms 66.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:51:32.8257027Z addmm 0.0140 ms 57.6% 2025-09-07T09:51:32.8257451Z SingleProcess AUTOTUNE benchmarking takes 0.3887 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:51:33.2447126Z Autotune Choices Stats: 2025-09-07T09:51:33.2448239Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_349", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2", "best_time": 0.007424000184983015, "best_triton_pos": 0} 
2025-09-07T09:51:33.2817599Z AUTOTUNE addmm(8x20, 8x480, 480x20) 2025-09-07T09:51:33.2817908Z strides: [0, 1], [480, 1], [1, 480] 2025-09-07T09:51:33.2818228Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:33.2818963Z triton_mm_349 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:51:33.2820038Z triton_mm_350 0.0075 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:33.2821077Z triton_mm_343 0.0075 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:33.2822246Z triton_mm_342 0.0084 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:33.2823302Z triton_mm_346 0.0085 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:51:33.2824339Z triton_mm_348 0.0089 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:51:33.2825774Z triton_mm_341 0.0090 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:51:33.2826534Z bias_addmm 0.0102 ms 72.7% 2025-09-07T09:51:33.2827281Z triton_mm_347 0.0103 ms 72.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:51:33.2828000Z addmm 0.0132 ms 56.4% 
2025-09-07T09:51:33.2828974Z SingleProcess AUTOTUNE benchmarking takes 0.2089 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:51:33.6807280Z Autotune Choices Stats: 2025-09-07T09:51:33.6808352Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_229", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1", "best_time": 0.006783999968320131, "best_triton_pos": 0} 2025-09-07T09:51:33.7432718Z AUTOTUNE addmm(8x10, 8x240, 240x10) 2025-09-07T09:51:33.7433432Z strides: [0, 1], [240, 1], [1, 240] 2025-09-07T09:51:33.7433789Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:33.7434635Z triton_mm_229 0.0068 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:33.7436188Z triton_mm_228 0.0069 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:51:33.7437367Z triton_mm_221 0.0070 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:33.7438515Z triton_mm_222 0.0070 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:33.7439671Z triton_mm_225 0.0071 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:51:33.7440819Z triton_mm_227 0.0072 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 
2025-09-07T09:51:33.7441955Z triton_mm_220 0.0075 ms 91.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:51:33.7443111Z triton_mm_226 0.0077 ms 87.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:51:33.7443840Z bias_addmm 0.0085 ms 80.0% 2025-09-07T09:51:33.7444538Z triton_mm_224 0.0092 ms 73.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:51:33.7445679Z SingleProcess AUTOTUNE benchmarking takes 0.2250 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:51:33.9900913Z Autotune Choices Stats: 2025-09-07T09:51:33.9902110Z {"num_choices": 8, "num_triton_choices": 6, "best_kernel": "triton_mm_7", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1", "best_time": 0.0058559998869895935, "best_triton_pos": 0} 2025-09-07T09:51:34.1933879Z AUTOTUNE addmm(8x8, 8x32, 32x8) 2025-09-07T09:51:34.1934366Z strides: [0, 1], [32, 1], [1, 32] 2025-09-07T09:51:34.1934872Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:34.1936478Z triton_mm_7 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:51:34.1937825Z triton_mm_8 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:34.1938738Z triton_mm_10 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:51:34.1939622Z triton_mm_11 0.0059 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:51:34.1940873Z triton_mm_6 0.0063 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1 2025-09-07T09:51:34.1941836Z triton_mm_9 0.0063 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:51:34.1942401Z bias_addmm 0.0074 ms 79.6% 2025-09-07T09:51:34.1942836Z addmm 0.0100 ms 58.7% 2025-09-07T09:51:34.1943256Z SingleProcess AUTOTUNE benchmarking takes 0.3197 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T09:51:34.4911575Z Autotune Choices Stats: 2025-09-07T09:51:34.4912612Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_104", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1", "best_time": 0.006496000103652477, "best_triton_pos": 0} 2025-09-07T09:51:34.6278359Z AUTOTUNE addmm(8x6, 8x144, 144x6) 2025-09-07T09:51:34.6278762Z strides: [0, 1], [144, 1], [1, 144] 2025-09-07T09:51:34.6279177Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:34.6280124Z triton_mm_104 0.0065 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:34.6281510Z triton_mm_110 0.0067 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:51:34.6282843Z triton_mm_112 0.0067 ms 97.6% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:34.6284182Z triton_mm_108 0.0067 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:51:34.6285969Z triton_mm_111 0.0068 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:51:34.6287323Z triton_mm_105 0.0068 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:34.6288666Z triton_mm_109 0.0069 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:51:34.6290014Z triton_mm_103 0.0075 ms 86.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:51:34.6291351Z triton_mm_107 0.0079 ms 81.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:51:34.6292197Z bias_addmm 0.0086 ms 75.7% 2025-09-07T09:51:34.6292837Z SingleProcess AUTOTUNE benchmarking takes 0.3116 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:51:34.9377805Z Autotune Choices Stats: 2025-09-07T09:51:34.9378891Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_53", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1", "best_time": 0.006304000038653612, "best_triton_pos": 0} 
2025-09-07T09:51:35.0594452Z AUTOTUNE addmm(8x4, 8x96, 96x4) 2025-09-07T09:51:35.0594764Z strides: [0, 1], [96, 1], [1, 96] 2025-09-07T09:51:35.0595251Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:35.0596392Z triton_mm_53 0.0063 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:51:35.0597439Z triton_mm_47 0.0063 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:35.0598510Z triton_mm_46 0.0064 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:51:35.0599812Z triton_mm_54 0.0064 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:51:35.0600855Z triton_mm_48 0.0065 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:35.0601891Z triton_mm_51 0.0065 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:51:35.0602925Z triton_mm_55 0.0065 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:51:35.0603959Z triton_mm_52 0.0067 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:51:35.0605110Z triton_mm_50 0.0073 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, 
BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:51:35.0605766Z bias_addmm 0.0076 ms 82.8% 2025-09-07T09:51:35.0606296Z SingleProcess AUTOTUNE benchmarking takes 0.2921 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:51:36.0314604Z Autotune Choices Stats: 2025-09-07T09:51:36.0316474Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.01616000011563301, "best_triton_pos": 1, "best_triton_time": 0.02208000048995018, "best_triton_kernel": "triton_convolution2d_4", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T09:51:36.2354582Z AUTOTUNE convolution(8x3x225x225, 32x3x3x3) 2025-09-07T09:51:36.2355337Z strides: [151875, 1, 675, 3], [27, 1, 9, 3] 2025-09-07T09:51:36.2355637Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:36.2355896Z convolution 0.0162 ms 100.0% 2025-09-07T09:51:36.2356629Z triton_convolution2d_4 0.0221 ms 73.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:51:36.2357874Z triton_convolution2d_0 0.0242 ms 66.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:51:36.2359056Z triton_convolution2d_3 0.0245 ms 65.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:51:36.2360179Z triton_convolution2d_2 0.0255 ms 63.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, 
UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:51:36.2361297Z triton_convolution2d_5 0.0284 ms 56.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:51:36.2362836Z triton_convolution2d_1 0.0436 ms 37.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:51:36.2363722Z SingleProcess AUTOTUNE benchmarking takes 0.2965 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T09:51:36.3405731Z Autotune Choices Stats: 2025-09-07T09:51:36.3407189Z {"num_choices": 7, "num_triton_choices": 5, "best_kernel": "triton_mm_12", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.0058559998869895935, "best_triton_pos": 0} 2025-09-07T09:51:36.4508873Z AUTOTUNE addmm(8x32, 8x8, 8x32) 2025-09-07T09:51:36.4509183Z strides: [0, 1], [8, 1], [1, 8] 2025-09-07T09:51:36.4509493Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:36.4510257Z triton_mm_12 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:51:36.4511345Z triton_mm_13 0.0059 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:51:36.4512393Z triton_mm_16 0.0059 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:51:36.4513432Z triton_mm_14 0.0059 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:36.4514454Z triton_mm_15 0.0059 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:51:36.4515475Z bias_addmm 0.0077 ms 76.3% 2025-09-07T09:51:36.4515727Z addmm 0.0098 ms 59.8% 2025-09-07T09:51:36.4516215Z SingleProcess AUTOTUNE benchmarking takes 0.2141 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T09:51:36.6524795Z Autotune Choices Stats: 2025-09-07T09:51:36.6526091Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_29", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.013791999779641628, "best_triton_pos": 0} 2025-09-07T09:51:36.6703110Z AUTOTUNE mm(100352x16, 16x96) 2025-09-07T09:51:36.6703369Z strides: [16, 1], [1, 16] 2025-09-07T09:51:36.6703629Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:36.6704329Z triton_mm_29 0.0138 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:51:36.6705530Z triton_mm_34 0.0139 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:36.6706511Z triton_mm_43 0.0139 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:36.6707519Z triton_mm_38 0.0140 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:36.6708564Z triton_mm_36 0.0142 ms 
97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:36.6709168Z mm 0.0142 ms 96.9% 2025-09-07T09:51:36.6709732Z triton_mm_37 0.0144 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:36.6711187Z triton_mm_39 0.0145 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:36.6712161Z triton_mm_41 0.0145 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:36.6713352Z triton_mm_33 0.0147 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:36.6714186Z SingleProcess AUTOTUNE benchmarking takes 0.2171 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T09:51:36.8514492Z Autotune Choices Stats: 2025-09-07T09:51:36.8515889Z {"num_choices": 14, "num_triton_choices": 12, "best_kernel": "triton_mm_56", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.005919999908655882, "best_triton_pos": 0} 2025-09-07T09:51:36.8921287Z AUTOTUNE addmm(8x96, 8x4, 4x96) 2025-09-07T09:51:36.8921584Z strides: [0, 1], [4, 1], [1, 4] 2025-09-07T09:51:36.8921888Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:36.8922594Z triton_mm_56 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:51:36.8923614Z 
triton_mm_58 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:36.8924583Z triton_mm_60 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:36.8925961Z triton_mm_65 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:36.8926934Z triton_mm_61 0.0060 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:36.8927972Z triton_mm_66 0.0060 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:51:36.8928868Z triton_mm_57 0.0060 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:51:36.8929761Z triton_mm_67 0.0060 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:36.8930654Z triton_mm_59 0.0060 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:36.8931537Z triton_mm_63 0.0061 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:36.8932318Z SingleProcess AUTOTUNE benchmarking takes 0.2212 seconds and 0.0002 seconds precompiling for 14 choices 
2025-09-07T09:51:37.1048616Z Autotune Choices Stats: 2025-09-07T09:51:37.1049691Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_75", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.009440000168979168, "best_triton_pos": 0} 2025-09-07T09:51:37.1123325Z AUTOTUNE mm(25088x96, 96x24) 2025-09-07T09:51:37.1124182Z strides: [96, 1], [1, 96] 2025-09-07T09:51:37.1124430Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:37.1125712Z triton_mm_75 0.0094 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:51:37.1126736Z triton_mm_77 0.0095 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:37.1128036Z triton_mm_71 0.0096 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:37.1129017Z triton_mm_78 0.0096 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:37.1129905Z triton_mm_74 0.0098 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:37.1130784Z triton_mm_76 0.0099 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:37.1131653Z triton_mm_81 0.0099 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=4, num_warps=8 2025-09-07T09:51:37.1132544Z triton_mm_84 0.0099 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:37.1133438Z triton_mm_79 0.0101 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:37.1134325Z triton_mm_80 0.0101 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:37.1135248Z SingleProcess AUTOTUNE benchmarking takes 0.2196 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T09:51:37.2965768Z Autotune Choices Stats: 2025-09-07T09:51:37.2966768Z {"num_choices": 14, "num_triton_choices": 12, "best_kernel": "triton_mm_118", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.005888000130653381, "best_triton_pos": 0} 2025-09-07T09:51:37.3104647Z AUTOTUNE addmm(8x144, 8x6, 6x144) 2025-09-07T09:51:37.3105122Z strides: [0, 1], [6, 1], [1, 6] 2025-09-07T09:51:37.3105434Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:37.3106149Z triton_mm_118 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:37.3107164Z triton_mm_115 0.0059 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:37.3108304Z triton_mm_117 0.0059 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, 
num_warps=4 2025-09-07T09:51:37.3109318Z triton_mm_113 0.0060 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:51:37.3110276Z triton_mm_114 0.0060 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:51:37.3111230Z triton_mm_116 0.0060 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:37.3112534Z triton_mm_120 0.0060 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:37.3113500Z triton_mm_122 0.0060 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:37.3114635Z triton_mm_123 0.0062 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:51:37.3115794Z triton_mm_121 0.0062 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:37.3116643Z SingleProcess AUTOTUNE benchmarking takes 0.1958 seconds and 0.0002 seconds precompiling for 14 choices 2025-09-07T09:51:37.5205867Z Autotune Choices Stats: 2025-09-07T09:51:37.5206935Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_140", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.010879999957978725, "best_triton_pos": 
0} 2025-09-07T09:51:37.5262944Z AUTOTUNE mm(25088x144, 144x24) 2025-09-07T09:51:37.5263220Z strides: [144, 1], [1, 144] 2025-09-07T09:51:37.5263498Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:37.5264164Z triton_mm_140 0.0109 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:37.5265624Z triton_mm_134 0.0109 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:37.5266636Z triton_mm_128 0.0110 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:37.5267636Z triton_mm_135 0.0110 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:37.5268310Z mm 0.0111 ms 98.0% 2025-09-07T09:51:37.5268869Z triton_mm_132 0.0111 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:51:37.5269823Z triton_mm_141 0.0113 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:37.5270784Z triton_mm_138 0.0114 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:37.5271761Z triton_mm_137 0.0115 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:37.5272716Z triton_mm_131 0.0117 ms 92.6% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:37.5273550Z SingleProcess AUTOTUNE benchmarking takes 0.2153 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:51:37.7131300Z Autotune Choices Stats: 2025-09-07T09:51:37.7132369Z {"num_choices": 14, "num_triton_choices": 12, "best_kernel": "triton_mm_235", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.005919999908655882, "best_triton_pos": 0} 2025-09-07T09:51:37.7499633Z AUTOTUNE addmm(8x240, 8x10, 10x240) 2025-09-07T09:51:37.7499944Z strides: [0, 1], [10, 1], [1, 10] 2025-09-07T09:51:37.7500252Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:37.7500939Z triton_mm_235 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:37.7502392Z triton_mm_239 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:37.7503378Z triton_mm_232 0.0060 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:37.7504344Z triton_mm_234 0.0060 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:37.7505732Z triton_mm_237 0.0061 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:37.7506727Z triton_mm_230 0.0061 ms 96.9% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:51:37.7507729Z triton_mm_233 0.0061 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:37.7508788Z triton_mm_231 0.0061 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:51:37.7509760Z triton_mm_240 0.0062 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:51:37.7510731Z triton_mm_241 0.0062 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:37.7511569Z SingleProcess AUTOTUNE benchmarking takes 0.2157 seconds and 0.0002 seconds precompiling for 14 choices 2025-09-07T09:51:37.9738262Z Autotune Choices Stats: 2025-09-07T09:51:37.9739593Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "mm", "best_time": 0.008799999952316284, "best_triton_pos": 1, "best_triton_time": 0.008799999952316284, "best_triton_kernel": "triton_mm_250", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:51:38.1866622Z AUTOTUNE mm(6272x240, 240x40) 2025-09-07T09:51:38.1866978Z strides: [240, 1], [1, 240] 2025-09-07T09:51:38.1867242Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:38.1867515Z mm 0.0088 ms 100.0% 2025-09-07T09:51:38.1868239Z triton_mm_250 0.0088 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4 2025-09-07T09:51:38.1869288Z triton_mm_253 0.0088 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:38.1870282Z triton_mm_254 0.0088 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:38.1871255Z triton_mm_249 0.0089 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:51:38.1872735Z triton_mm_259 0.0093 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:38.1873692Z triton_mm_252 0.0094 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:38.1874661Z triton_mm_258 0.0094 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:38.1876328Z triton_mm_245 0.0094 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:38.1877315Z triton_mm_256 0.0095 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:38.1878182Z SingleProcess AUTOTUNE benchmarking takes 0.4360 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:51:38.3877011Z Autotune Choices Stats: 2025-09-07T09:51:38.3878079Z {"num_choices": 15, "num_triton_choices": 13, "best_kernel": "triton_mm_361", 
"best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.006016000173985958, "best_triton_pos": 0} 2025-09-07T09:51:38.4016185Z AUTOTUNE addmm(8x480, 8x20, 20x480) 2025-09-07T09:51:38.4016498Z strides: [0, 1], [20, 1], [1, 20] 2025-09-07T09:51:38.4016824Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:38.4017539Z triton_mm_361 0.0060 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:38.4018536Z triton_mm_357 0.0061 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:38.4019502Z triton_mm_356 0.0061 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:38.4020453Z triton_mm_353 0.0061 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:38.4021497Z triton_mm_359 0.0062 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:38.4022455Z triton_mm_362 0.0063 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:51:38.4023400Z triton_mm_354 0.0064 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:38.4024349Z triton_mm_363 0.0064 ms 93.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:38.4025516Z triton_mm_351 0.0065 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:51:38.4026468Z triton_mm_352 0.0065 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:51:38.4027307Z SingleProcess AUTOTUNE benchmarking takes 0.2060 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:51:38.6398311Z Autotune Choices Stats: 2025-09-07T09:51:38.6400151Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_368", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0080960001796484, "best_triton_pos": 0} 2025-09-07T09:51:38.8274851Z AUTOTUNE mm(1568x480, 480x80) 2025-09-07T09:51:38.8275302Z strides: [480, 1], [1, 480] 2025-09-07T09:51:38.8275556Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:38.8276565Z triton_mm_368 0.0081 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:38.8277598Z triton_mm_372 0.0085 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:38.8278226Z mm 0.0087 ms 92.7% 2025-09-07T09:51:38.8278798Z triton_mm_376 0.0093 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:38.8279756Z triton_mm_367 0.0094 ms 86.3% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:38.8280649Z triton_mm_371 0.0095 ms 84.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:51:38.8281539Z triton_mm_366 0.0096 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:38.8282428Z triton_mm_375 0.0101 ms 79.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:38.8283325Z triton_mm_365 0.0102 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:38.8284210Z triton_mm_374 0.0102 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:38.8285138Z SingleProcess AUTOTUNE benchmarking takes 0.4253 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:51:39.0757066Z Autotune Choices Stats: 2025-09-07T09:51:39.0758102Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_492", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008287999778985977, "best_triton_pos": 0} 2025-09-07T09:51:39.2692235Z AUTOTUNE mm(1568x480, 480x112) 2025-09-07T09:51:39.2692640Z strides: [480, 1], [1, 480] 2025-09-07T09:51:39.2693045Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:39.2694074Z triton_mm_492 0.0083 ms 100.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:39.2695568Z mm 0.0085 ms 97.0% 2025-09-07T09:51:39.2696459Z triton_mm_496 0.0087 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:39.2697951Z triton_mm_491 0.0093 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:39.2699246Z triton_mm_490 0.0094 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:39.2700172Z triton_mm_500 0.0096 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:39.2701594Z triton_mm_495 0.0096 ms 86.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:51:39.2702503Z triton_mm_489 0.0101 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:39.2703584Z triton_mm_499 0.0101 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:39.2704478Z triton_mm_498 0.0103 ms 80.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:39.2705466Z SingleProcess AUTOTUNE benchmarking takes 0.4315 seconds and 0.0002 seconds precompiling for 20 choices 
2025-09-07T09:51:39.5047381Z Autotune Choices Stats: 2025-09-07T09:51:39.5049013Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_518", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.008415999822318554, "best_triton_pos": 0} 2025-09-07T09:51:39.7159822Z AUTOTUNE mm(1568x112, 112x672) 2025-09-07T09:51:39.7160134Z strides: [112, 1], [1, 112] 2025-09-07T09:51:39.7160435Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:39.7161120Z triton_mm_518 0.0084 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:39.7161763Z mm 0.0084 ms 99.6% 2025-09-07T09:51:39.7162359Z triton_mm_520 0.0084 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:39.7163386Z triton_mm_516 0.0085 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:39.7164370Z triton_mm_519 0.0085 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:39.7165711Z triton_mm_514 0.0086 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:51:39.7166682Z triton_mm_524 0.0086 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:39.7167655Z triton_mm_517 0.0087 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:39.7168686Z triton_mm_521 0.0088 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:39.7169576Z triton_mm_515 0.0088 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:39.7170356Z SingleProcess AUTOTUNE benchmarking takes 0.4462 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:51:39.9085569Z Autotune Choices Stats: 2025-09-07T09:51:39.9087155Z {"num_choices": 15, "num_triton_choices": 13, "best_kernel": "triton_mm_543", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.006111999973654747, "best_triton_pos": 0} 2025-09-07T09:51:39.9470045Z AUTOTUNE addmm(8x672, 8x28, 28x672) 2025-09-07T09:51:39.9470313Z strides: [0, 1], [28, 1], [1, 28] 2025-09-07T09:51:39.9470599Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:39.9471287Z triton_mm_543 0.0061 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:39.9472489Z triton_mm_542 0.0061 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:39.9473501Z triton_mm_547 0.0062 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:39.9474480Z triton_mm_545 0.0062 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:39.9475614Z triton_mm_539 0.0064 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:39.9476602Z triton_mm_548 0.0064 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:51:39.9477580Z triton_mm_549 0.0064 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:39.9478555Z triton_mm_540 0.0064 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:39.9479510Z triton_mm_541 0.0064 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:39.9480416Z triton_mm_537 0.0067 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:51:39.9481214Z SingleProcess AUTOTUNE benchmarking takes 0.2305 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:51:40.1823324Z Autotune Choices Stats: 2025-09-07T09:51:40.1824684Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.00902399979531765, "best_triton_pos": 1, "best_triton_time": 0.009184000082314014, "best_triton_kernel": "triton_mm_554", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:51:40.3936219Z AUTOTUNE 
mm(1568x672, 672x112) 2025-09-07T09:51:40.3936600Z strides: [672, 1], [1, 672] 2025-09-07T09:51:40.3936870Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:40.3937149Z mm 0.0090 ms 100.0% 2025-09-07T09:51:40.3937793Z triton_mm_554 0.0092 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:40.3938840Z triton_mm_558 0.0095 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:40.3950401Z triton_mm_553 0.0105 ms 86.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:40.3951252Z triton_mm_552 0.0107 ms 84.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:40.3952570Z triton_mm_562 0.0107 ms 84.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:40.3953398Z triton_mm_557 0.0110 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:51:40.3954412Z triton_mm_561 0.0116 ms 77.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:40.3955412Z triton_mm_560 0.0118 ms 76.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:40.3956216Z triton_mm_551 0.0119 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, 
BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:40.3956949Z SingleProcess AUTOTUNE benchmarking takes 0.4459 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:51:40.9818845Z Autotune Choices Stats: 2025-09-07T09:51:40.9820117Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.008191999979317188, "best_triton_pos": 1, "best_triton_time": 0.008511999621987343, "best_triton_kernel": "triton_mm_678", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:51:41.0829257Z AUTOTUNE mm(392x672, 672x192) 2025-09-07T09:51:41.0829548Z strides: [672, 1], [1, 672] 2025-09-07T09:51:41.0829865Z dtypes: torch.float16, torch.float16 2025-09-07T09:51:41.0830192Z mm 0.0082 ms 100.0% 2025-09-07T09:51:41.0830807Z triton_mm_678 0.0085 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:41.0831789Z triton_mm_682 0.0087 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:41.0832758Z triton_mm_677 0.0096 ms 85.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:41.0833723Z triton_mm_676 0.0100 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:41.0834681Z triton_mm_681 0.0102 ms 80.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:51:41.0837449Z 
triton_mm_686 0.0103 ms 79.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:41.0838456Z triton_mm_675 0.0110 ms 74.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:41.0839433Z triton_mm_684 0.0114 ms 72.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:41.0840333Z triton_mm_685 0.0114 ms 72.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:41.0841069Z SingleProcess AUTOTUNE benchmarking takes 0.6772 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:51:41.3351287Z Autotune Choices Stats: 2025-09-07T09:51:41.3352263Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_739", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.006591999903321266, "best_triton_pos": 0} 2025-09-07T09:51:41.5315379Z AUTOTUNE addmm(8x1152, 8x48, 48x1152) 2025-09-07T09:51:41.5317875Z strides: [0, 1], [48, 1], [1, 48] 2025-09-07T09:51:41.5318173Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:51:41.5319160Z triton_mm_739 0.0066 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:41.5320145Z triton_mm_727 0.0066 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:41.5321032Z 
triton_mm_728 0.0066 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:51:41.5321919Z triton_mm_732 0.0066 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:41.5322802Z triton_mm_737 0.0067 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:51:41.5323695Z triton_mm_738 0.0067 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:51:41.5324578Z triton_mm_735 0.0068 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:51:41.5325628Z triton_mm_733 0.0068 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:51:41.5326508Z triton_mm_731 0.0068 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:51:41.5327399Z triton_mm_741 0.0068 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:51:41.5328180Z SingleProcess AUTOTUNE benchmarking takes 0.4451 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:52:12.1110678Z pass 2025-09-07T09:52:17.3485784Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. 
Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:52:17.3487048Z import pynvml # type: ignore[import] 2025-09-07T09:52:20.3304328Z 2025-09-07T09:52:21.8079333Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:52:21.8079668Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:52:21.8079942Z cuda train tf_mixnet_l 2025-09-07T09:52:59.7858508Z Autotune Choices Stats: 2025-09-07T09:52:59.7860181Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_1338", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.010495999827980995, "best_triton_pos": 0} 2025-09-07T09:52:59.9797192Z AUTOTUNE addmm(8x132, 8x1584, 1584x132) 2025-09-07T09:52:59.9797500Z strides: [0, 1], [1584, 1], [1, 1584] 2025-09-07T09:52:59.9797812Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:52:59.9798530Z triton_mm_1338 0.0105 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:52:59.9799885Z triton_mm_1342 0.0114 ms 91.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:52:59.9800876Z triton_mm_1346 0.0131 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:52:59.9802106Z triton_mm_1350 0.0142 ms 74.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:52:59.9803084Z triton_mm_1337 0.0142 ms 73.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:52:59.9804091Z triton_mm_1336 0.0149 ms 70.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:52:59.9805336Z triton_mm_1341 0.0156 ms 67.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:52:59.9805959Z addmm 0.0158 ms 66.4% 2025-09-07T09:52:59.9806544Z triton_mm_1335 0.0168 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:52:59.9807523Z triton_mm_1345 0.0173 ms 60.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:52:59.9808372Z SingleProcess AUTOTUNE benchmarking takes 0.4454 seconds and 0.0004 seconds precompiling for 19 choices 2025-09-07T09:53:00.5182367Z Autotune Choices Stats: 2025-09-07T09:53:00.5183416Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_955", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.007712000049650669, "best_triton_pos": 0} 2025-09-07T09:53:00.7219493Z AUTOTUNE addmm(8x80, 8x480, 480x80) 2025-09-07T09:53:00.7219812Z strides: [0, 1], [480, 1], [1, 480] 2025-09-07T09:53:00.7220094Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:00.7220791Z triton_mm_955 0.0077 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:00.7221943Z triton_mm_959 0.0080 
ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:00.7222663Z bias_addmm 0.0081 ms 95.3% 2025-09-07T09:53:00.7223363Z triton_mm_954 0.0082 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:00.7224392Z triton_mm_963 0.0084 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:00.7225728Z triton_mm_953 0.0086 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:00.7226703Z triton_mm_967 0.0090 ms 85.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:00.7227664Z triton_mm_958 0.0091 ms 84.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:00.7229071Z triton_mm_952 0.0093 ms 82.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:00.7230040Z triton_mm_965 0.0093 ms 82.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:00.7230891Z SingleProcess AUTOTUNE benchmarking takes 0.4455 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:53:01.2123695Z Autotune Choices Stats: 2025-09-07T09:53:01.2125883Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "bias_addmm", "best_time": 
0.008767999708652496, "best_triton_pos": 1, "best_triton_time": 0.009119999594986439, "best_triton_kernel": "triton_mm_1266", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2"} 2025-09-07T09:53:01.3933469Z AUTOTUNE addmm(8x80, 8x960, 960x80) 2025-09-07T09:53:01.3933812Z strides: [0, 1], [960, 1], [1, 960] 2025-09-07T09:53:01.3934125Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:01.3934472Z bias_addmm 0.0088 ms 100.0% 2025-09-07T09:53:01.3935344Z triton_mm_1266 0.0091 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:01.3936357Z triton_mm_1270 0.0091 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:01.3937370Z triton_mm_1274 0.0101 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:01.3938374Z triton_mm_1265 0.0105 ms 83.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:01.3939357Z triton_mm_1278 0.0106 ms 82.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:01.3940322Z triton_mm_1269 0.0112 ms 78.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:01.3940952Z addmm 0.0113 ms 77.8% 2025-09-07T09:53:01.3941680Z triton_mm_1264 0.0113 ms 77.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:01.3942639Z triton_mm_1273 0.0116 ms 75.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:01.3943378Z SingleProcess AUTOTUNE benchmarking takes 0.4191 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:53:01.6136566Z Autotune Choices Stats: 2025-09-07T09:53:01.6137638Z {"num_choices": 15, "num_triton_choices": 13, "best_kernel": "triton_mm_868", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.007615999784320593, "best_triton_pos": 0} 2025-09-07T09:53:01.6200573Z AUTOTUNE addmm(8x52, 8x624, 624x52) 2025-09-07T09:53:01.6200911Z strides: [0, 1], [624, 1], [1, 624] 2025-09-07T09:53:01.6201214Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:01.6202022Z triton_mm_868 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:01.6203170Z triton_mm_872 0.0080 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:01.6204626Z triton_mm_876 0.0082 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:01.6206032Z triton_mm_875 0.0083 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:01.6207297Z triton_mm_867 0.0091 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:01.6208291Z triton_mm_866 0.0093 ms 81.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:01.6209267Z triton_mm_871 0.0096 ms 79.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:01.6210250Z triton_mm_865 0.0102 ms 74.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:01.6211233Z triton_mm_874 0.0103 ms 74.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:01.6211880Z bias_addmm 0.0112 ms 68.2% 2025-09-07T09:53:01.6212277Z SingleProcess AUTOTUNE benchmarking takes 0.2041 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:53:02.0028037Z Autotune Choices Stats: 2025-09-07T09:53:02.0029017Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_257", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2", "best_time": 0.007071999832987785, "best_triton_pos": 0} 2025-09-07T09:53:02.0661937Z AUTOTUNE addmm(8x28, 8x336, 336x28) 2025-09-07T09:53:02.0662274Z strides: [0, 1], [336, 1], [1, 336] 2025-09-07T09:53:02.0662555Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:02.0663204Z triton_mm_257 0.0071 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:53:02.0664129Z triton_mm_251 0.0071 
ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.0665392Z triton_mm_258 0.0071 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.0666300Z triton_mm_250 0.0075 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.0667196Z triton_mm_254 0.0076 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:53:02.0668078Z triton_mm_256 0.0079 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:53:02.0668969Z triton_mm_249 0.0081 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:02.0669851Z triton_mm_255 0.0086 ms 81.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:53:02.0670787Z bias_addmm 0.0094 ms 74.9% 2025-09-07T09:53:02.0671328Z triton_mm_253 0.0109 ms 64.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:02.0672101Z SingleProcess AUTOTUNE benchmarking takes 0.2332 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:53:02.2972807Z Autotune Choices Stats: 2025-09-07T09:53:02.2974414Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_592", "best_kernel_desc": 
"ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.007648000027984381, "best_triton_pos": 0} 2025-09-07T09:53:02.5188941Z AUTOTUNE addmm(8x26, 8x624, 624x26) 2025-09-07T09:53:02.5189305Z strides: [0, 1], [624, 1], [1, 624] 2025-09-07T09:53:02.5189621Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:02.5190387Z triton_mm_592 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.5191434Z triton_mm_598 0.0077 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:53:02.5192648Z triton_mm_599 0.0077 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.5193701Z triton_mm_591 0.0090 ms 84.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.5194740Z triton_mm_595 0.0094 ms 81.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:53:02.5196149Z triton_mm_597 0.0096 ms 79.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:53:02.5197202Z triton_mm_590 0.0102 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:02.5197873Z bias_addmm 0.0113 ms 67.9% 2025-09-07T09:53:02.5198515Z triton_mm_596 0.0115 ms 
66.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:53:02.5199170Z addmm 0.0148 ms 51.8% 2025-09-07T09:53:02.5199653Z SingleProcess AUTOTUNE benchmarking takes 0.3948 seconds and 0.0003 seconds precompiling for 13 choices 2025-09-07T09:53:02.9359663Z Autotune Choices Stats: 2025-09-07T09:53:02.9360848Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_182", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.0066559999249875546, "best_triton_pos": 0} 2025-09-07T09:53:02.9681091Z AUTOTUNE addmm(8x20, 8x240, 240x20) 2025-09-07T09:53:02.9681484Z strides: [0, 1], [240, 1], [1, 240] 2025-09-07T09:53:02.9681889Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:02.9682951Z triton_mm_182 0.0067 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.9684179Z triton_mm_175 0.0069 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.9685785Z triton_mm_181 0.0070 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:53:02.9687382Z triton_mm_180 0.0071 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T09:53:02.9688588Z triton_mm_178 0.0071 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 
2025-09-07T09:53:02.9690036Z triton_mm_174 0.0072 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:02.9691229Z triton_mm_173 0.0073 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:02.9692531Z triton_mm_179 0.0080 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T09:53:02.9693263Z bias_addmm 0.0087 ms 76.5% 2025-09-07T09:53:02.9693973Z triton_mm_177 0.0097 ms 68.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:02.9695139Z SingleProcess AUTOTUNE benchmarking takes 0.2041 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:53:03.1656205Z Autotune Choices Stats: 2025-09-07T09:53:03.1657351Z {"num_choices": 13, "num_triton_choices": 11, "best_kernel": "triton_mm_512", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1", "best_time": 0.006912000011652708, "best_triton_pos": 0} 2025-09-07T09:53:03.1939239Z AUTOTUNE addmm(8x14, 8x336, 336x14) 2025-09-07T09:53:03.1939561Z strides: [0, 1], [336, 1], [1, 336] 2025-09-07T09:53:03.1939905Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:03.1940692Z triton_mm_512 0.0069 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:53:03.1941913Z triton_mm_518 0.0070 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:53:03.1943062Z triton_mm_511 0.0071 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:53:03.1944112Z triton_mm_519 0.0072 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:53:03.1945496Z triton_mm_515 0.0078 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:53:03.1946581Z triton_mm_510 0.0079 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:53:03.1947633Z triton_mm_517 0.0079 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:53:03.1948676Z triton_mm_516 0.0084 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:53:03.1949341Z bias_addmm 0.0090 ms 76.6% 2025-09-07T09:53:03.1949975Z triton_mm_514 0.0111 ms 62.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:53:03.1951285Z SingleProcess AUTOTUNE benchmarking takes 0.1983 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:53:04.3895633Z Autotune Choices Stats: 2025-09-07T09:53:04.3896769Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_mm_17", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.009952000342309475, "best_triton_pos": 0} 2025-09-07T09:53:04.5491696Z AUTOTUNE mm(100352x32, 32x32) 2025-09-07T09:53:04.5492642Z strides: [32, 1], [1, 32] 2025-09-07T09:53:04.5493100Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:04.5493812Z triton_mm_17 0.0100 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:04.5494884Z triton_mm_20 0.0100 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:04.5496195Z triton_mm_19 0.0101 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:04.5497232Z triton_mm_16 0.0102 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:04.5498256Z triton_mm_12 0.0103 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:04.5499272Z triton_mm_14 0.0103 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:04.5500293Z triton_mm_18 0.0103 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:04.5501321Z triton_mm_10 0.0104 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:04.5502445Z triton_mm_7 0.0104 ms 
95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:04.5503446Z triton_mm_8 0.0104 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:04.5504216Z SingleProcess AUTOTUNE benchmarking takes 0.3472 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T09:53:04.7601508Z Autotune Choices Stats: 2025-09-07T09:53:04.7602729Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_57", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.009375999681651592, "best_triton_pos": 0} 2025-09-07T09:53:04.9838619Z AUTOTUNE mm(25088x96, 96x20) 2025-09-07T09:53:04.9838933Z strides: [96, 1], [1, 96] 2025-09-07T09:53:04.9839216Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:04.9840001Z triton_mm_57 0.0094 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:04.9841169Z triton_mm_61 0.0094 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:04.9842355Z triton_mm_59 0.0094 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:04.9843808Z triton_mm_62 0.0095 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:04.9845164Z triton_mm_63 0.0095 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:04.9846516Z triton_mm_65 0.0095 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:04.9847669Z triton_mm_64 0.0095 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:04.9848818Z triton_mm_56 0.0095 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:04.9849958Z triton_mm_54 0.0096 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:04.9851091Z triton_mm_60 0.0096 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:04.9852089Z SingleProcess AUTOTUNE benchmarking takes 0.4299 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:53:05.1818443Z Autotune Choices Stats: 2025-09-07T09:53:05.1819415Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_97", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.007296000141650438, "best_triton_pos": 0} 2025-09-07T09:53:05.2070396Z AUTOTUNE mm(25088x20, 20x60) 2025-09-07T09:53:05.2070839Z strides: [20, 1], [1, 20] 2025-09-07T09:53:05.2071244Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:05.2072328Z triton_mm_97 0.0073 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:05.2073826Z triton_mm_94 0.0073 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:05.2074866Z triton_mm_93 0.0074 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:05.2076203Z triton_mm_95 0.0075 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:05.2077243Z triton_mm_98 0.0075 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:05.2078291Z triton_mm_96 0.0075 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:05.2079321Z triton_mm_92 0.0076 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:05.2080367Z triton_mm_100 0.0077 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:05.2081430Z triton_mm_101 0.0078 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:05.2082763Z triton_mm_102 0.0078 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:05.2083587Z SingleProcess 
AUTOTUNE benchmarking takes 0.2204 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T09:53:05.4162902Z Autotune Choices Stats: 2025-09-07T09:53:05.4164570Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_120", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.008031999692320824, "best_triton_pos": 0} 2025-09-07T09:53:05.4279558Z AUTOTUNE mm(25088x60, 60x20) 2025-09-07T09:53:05.4279852Z strides: [60, 1], [1, 60] 2025-09-07T09:53:05.4280115Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:05.4280809Z triton_mm_120 0.0080 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:05.4281877Z triton_mm_123 0.0083 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:05.4282957Z triton_mm_127 0.0083 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:05.4284023Z triton_mm_130 0.0083 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:05.4285250Z triton_mm_134 0.0083 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:05.4286290Z triton_mm_129 0.0083 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:05.4287327Z triton_mm_135 0.0084 ms 95.8% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:05.4288360Z triton_mm_126 0.0084 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:05.4289392Z triton_mm_128 0.0084 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:05.4290415Z triton_mm_122 0.0085 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:05.4291342Z SingleProcess AUTOTUNE benchmarking takes 0.2181 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:53:05.6631721Z Autotune Choices Stats: 2025-09-07T09:53:05.6632865Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_169", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.012512000277638435, "best_triton_pos": 0} 2025-09-07T09:53:05.8800533Z AUTOTUNE mm(25088x40, 40x240) 2025-09-07T09:53:05.8800846Z strides: [40, 1], [1, 40] 2025-09-07T09:53:05.8801144Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:05.8801931Z triton_mm_169 0.0125 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:05.8803258Z triton_mm_164 0.0126 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:05.8804866Z triton_mm_165 0.0127 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:05.8806305Z triton_mm_161 0.0129 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:05.8807747Z triton_mm_171 0.0135 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:05.8808968Z triton_mm_170 0.0136 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:05.8810168Z triton_mm_166 0.0139 ms 90.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:05.8811395Z triton_mm_167 0.0139 ms 90.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:05.8812583Z triton_mm_160 0.0142 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:05.8813647Z triton_mm_168 0.0144 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:05.8814499Z SingleProcess AUTOTUNE benchmarking takes 0.4495 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:53:06.0713147Z Autotune Choices Stats: 2025-09-07T09:53:06.0714227Z {"num_choices": 15, "num_triton_choices": 13, "best_kernel": "triton_mm_188", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.005888000130653381, "best_triton_pos": 0} 2025-09-07T09:53:06.1067027Z AUTOTUNE addmm(8x240, 8x20, 20x240) 2025-09-07T09:53:06.1067410Z strides: [0, 1], [20, 1], [1, 20] 2025-09-07T09:53:06.1067739Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:06.1068524Z triton_mm_188 0.0059 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:06.1069560Z triton_mm_193 0.0060 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:06.1070557Z triton_mm_194 0.0060 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:06.1071553Z triton_mm_189 0.0061 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:06.1072523Z triton_mm_185 0.0061 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:06.1073677Z triton_mm_195 0.0061 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:06.1074727Z triton_mm_191 0.0062 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:06.1076115Z triton_mm_184 0.0063 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 
2025-09-07T09:53:06.1077795Z triton_mm_187 0.0064 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:06.1078843Z triton_mm_183 0.0065 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:53:06.1080037Z SingleProcess AUTOTUNE benchmarking takes 0.2261 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:53:06.3276079Z Autotune Choices Stats: 2025-09-07T09:53:06.3277219Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_207", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.00863999966531992, "best_triton_pos": 0} 2025-09-07T09:53:06.3344085Z AUTOTUNE mm(6272x240, 240x56) 2025-09-07T09:53:06.3344371Z strides: [240, 1], [1, 240] 2025-09-07T09:53:06.3344636Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:06.3345476Z triton_mm_207 0.0086 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:06.3346522Z triton_mm_204 0.0088 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:06.3347539Z triton_mm_203 0.0089 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:06.3348526Z triton_mm_208 0.0089 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:06.3349167Z mm 0.0091 
ms 95.4% 2025-09-07T09:53:06.3349748Z triton_mm_206 0.0091 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:06.3350728Z triton_mm_212 0.0093 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:06.3351712Z triton_mm_213 0.0094 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:06.3352734Z triton_mm_199 0.0094 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:06.3354006Z triton_mm_197 0.0095 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:06.3355150Z SingleProcess AUTOTUNE benchmarking takes 0.2273 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:53:06.5414539Z Autotune Choices Stats: 2025-09-07T09:53:06.5415703Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_216", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.007199999876320362, "best_triton_pos": 0} 2025-09-07T09:53:06.5627116Z AUTOTUNE mm(6272x28, 28x168) 2025-09-07T09:53:06.5627372Z strides: [28, 1], [1, 28] 2025-09-07T09:53:06.5627644Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:06.5628304Z triton_mm_216 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:06.5629612Z 
triton_mm_221 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:06.5630580Z triton_mm_218 0.0073 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:06.5631530Z triton_mm_217 0.0074 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:06.5632713Z triton_mm_220 0.0074 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:06.5633811Z triton_mm_222 0.0074 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:06.5634761Z triton_mm_215 0.0074 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:06.5635907Z triton_mm_223 0.0074 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:06.5636878Z triton_mm_224 0.0074 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:06.5637834Z triton_mm_227 0.0074 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:06.5638668Z SingleProcess AUTOTUNE benchmarking takes 0.2277 seconds and 0.0002 seconds precompiling for 18 choices 
2025-09-07T09:53:06.7559887Z Autotune Choices Stats: 2025-09-07T09:53:06.7560845Z {"num_choices": 15, "num_triton_choices": 13, "best_kernel": "triton_mm_265", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.005760000087320805, "best_triton_pos": 0} 2025-09-07T09:53:06.9778075Z AUTOTUNE addmm(8x336, 8x28, 28x336) 2025-09-07T09:53:06.9778539Z strides: [0, 1], [28, 1], [1, 28] 2025-09-07T09:53:06.9778865Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:06.9779630Z triton_mm_265 0.0058 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:06.9780727Z triton_mm_271 0.0058 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:06.9781899Z triton_mm_261 0.0060 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:06.9782995Z triton_mm_269 0.0060 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:06.9784156Z triton_mm_264 0.0060 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:06.9785359Z triton_mm_267 0.0061 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:06.9786415Z triton_mm_270 0.0061 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:06.9787752Z triton_mm_262 0.0062 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:06.9788813Z triton_mm_259 0.0064 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:53:06.9790102Z triton_mm_263 0.0064 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:06.9791020Z SingleProcess AUTOTUNE benchmarking takes 0.4123 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:53:07.1922195Z Autotune Choices Stats: 2025-09-07T09:53:07.1923284Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_275", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.008767999708652496, "best_triton_pos": 0} 2025-09-07T09:53:07.2004057Z AUTOTUNE mm(6272x168, 168x28) 2025-09-07T09:53:07.2004378Z strides: [168, 1], [1, 168] 2025-09-07T09:53:07.2004683Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:07.2005580Z triton_mm_275 0.0088 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:07.2006732Z triton_mm_287 0.0089 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:07.2007845Z triton_mm_288 0.0090 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:07.2008985Z triton_mm_276 0.0090 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:07.2010107Z triton_mm_281 0.0091 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:07.2011246Z triton_mm_282 0.0091 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:07.2012406Z triton_mm_285 0.0091 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:07.2013616Z triton_mm_279 0.0092 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:07.2014658Z triton_mm_280 0.0093 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:07.2015820Z triton_mm_283 0.0093 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:07.2016746Z SingleProcess AUTOTUNE benchmarking takes 0.2221 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:53:07.4569902Z Autotune Choices Stats: 2025-09-07T09:53:07.4571111Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_497", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 
0.00848000030964613, "best_triton_pos": 0} 2025-09-07T09:53:07.5977812Z AUTOTUNE mm(6272x56, 56x336) 2025-09-07T09:53:07.5978089Z strides: [56, 1], [1, 56] 2025-09-07T09:53:07.5978687Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:07.5979397Z triton_mm_497 0.0085 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:07.5980467Z triton_mm_501 0.0085 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:07.5981856Z triton_mm_498 0.0086 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:07.5982925Z triton_mm_502 0.0086 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:07.5984098Z triton_mm_494 0.0088 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:07.5985445Z triton_mm_508 0.0088 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:07.5986560Z triton_mm_507 0.0089 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:07.5987639Z triton_mm_504 0.0094 ms 90.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:07.5988697Z triton_mm_503 0.0097 ms 87.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:07.5989756Z triton_mm_505 0.0097 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:07.5990688Z SingleProcess AUTOTUNE benchmarking takes 0.3737 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:53:08.7683610Z Autotune Choices Stats: 2025-09-07T09:53:08.7685396Z {"num_choices": 14, "num_triton_choices": 12, "best_kernel": "triton_mm_522", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00595200015231967, "best_triton_pos": 0} 2025-09-07T09:53:08.9121865Z AUTOTUNE addmm(8x336, 8x14, 14x336) 2025-09-07T09:53:08.9122255Z strides: [0, 1], [14, 1], [1, 14] 2025-09-07T09:53:08.9122620Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:08.9123458Z triton_mm_522 0.0060 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:08.9124903Z triton_mm_530 0.0060 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:08.9126637Z triton_mm_520 0.0060 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:53:08.9127871Z triton_mm_521 0.0061 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:08.9129056Z triton_mm_529 0.0062 ms 96.4% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:08.9130242Z triton_mm_524 0.0062 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:08.9131950Z triton_mm_523 0.0063 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:08.9133143Z triton_mm_525 0.0063 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:08.9134540Z triton_mm_527 0.0063 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:08.9135759Z triton_mm_531 0.0063 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:08.9136676Z SingleProcess AUTOTUNE benchmarking takes 1.3137 seconds and 0.0002 seconds precompiling for 14 choices 2025-09-07T09:53:09.1514263Z Autotune Choices Stats: 2025-09-07T09:53:09.1515716Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_536", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007935999892652035, "best_triton_pos": 0} 2025-09-07T09:53:09.3389089Z AUTOTUNE mm(1568x336, 336x104) 2025-09-07T09:53:09.3389389Z strides: [336, 1], [1, 336] 2025-09-07T09:53:09.3389664Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:09.3390387Z triton_mm_536 0.0079 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, 
BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:09.3391456Z triton_mm_540 0.0083 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:09.3392126Z mm 0.0083 ms 95.4% 2025-09-07T09:53:09.3392744Z triton_mm_535 0.0084 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:09.3393836Z triton_mm_534 0.0086 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:09.3395352Z triton_mm_539 0.0087 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:09.3396523Z triton_mm_533 0.0091 ms 87.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:09.3397684Z triton_mm_544 0.0091 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:09.3398841Z triton_mm_543 0.0092 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:09.3399983Z triton_mm_542 0.0093 ms 85.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:09.3400993Z SingleProcess AUTOTUNE benchmarking takes 0.4259 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:53:09.5711492Z Autotune 
Choices Stats: 2025-09-07T09:53:09.5712582Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_557", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.00684799998998642, "best_triton_pos": 0} 2025-09-07T09:53:09.6171428Z AUTOTUNE mm(1568x52, 52x312) 2025-09-07T09:53:09.6171735Z strides: [52, 1], [1, 52] 2025-09-07T09:53:09.6171992Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:09.6172667Z triton_mm_557 0.0068 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:09.6174103Z triton_mm_552 0.0069 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:09.6175590Z triton_mm_553 0.0069 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:09.6176551Z triton_mm_558 0.0069 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:09.6177513Z triton_mm_559 0.0070 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:09.6178454Z triton_mm_555 0.0070 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:09.6179411Z triton_mm_562 0.0070 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T09:53:09.6180362Z triton_mm_563 0.0072 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:09.6181320Z triton_mm_560 0.0072 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:09.6182382Z triton_mm_565 0.0073 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:09.6183249Z SingleProcess AUTOTUNE benchmarking takes 0.2772 seconds and 0.0005 seconds precompiling for 20 choices 2025-09-07T09:53:09.8011308Z Autotune Choices Stats: 2025-09-07T09:53:09.8012373Z {"num_choices": 15, "num_triton_choices": 13, "best_kernel": "triton_mm_610", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.006016000173985958, "best_triton_pos": 0} 2025-09-07T09:53:09.8787373Z AUTOTUNE addmm(8x624, 8x26, 26x624) 2025-09-07T09:53:09.8787782Z strides: [0, 1], [26, 1], [1, 26] 2025-09-07T09:53:09.8788112Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:09.8788875Z triton_mm_610 0.0060 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:09.8789887Z triton_mm_605 0.0061 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:09.8790854Z triton_mm_602 0.0061 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 
2025-09-07T09:53:09.8791818Z triton_mm_606 0.0061 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:09.8792775Z triton_mm_603 0.0062 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:09.8794538Z triton_mm_601 0.0062 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:09.8795665Z triton_mm_608 0.0064 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:09.8796889Z triton_mm_604 0.0064 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:09.8797859Z triton_mm_611 0.0064 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:09.8798824Z triton_mm_600 0.0065 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:53:09.8799667Z SingleProcess AUTOTUNE benchmarking takes 0.2585 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T09:53:10.0877080Z Autotune Choices Stats: 2025-09-07T09:53:10.0878104Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_617", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008031999692320824, "best_triton_pos": 0} 
2025-09-07T09:53:10.0987088Z AUTOTUNE mm(1568x312, 312x52) 2025-09-07T09:53:10.0987375Z strides: [312, 1], [1, 312] 2025-09-07T09:53:10.0987632Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:10.0988335Z triton_mm_617 0.0080 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:10.0989366Z triton_mm_625 0.0084 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:10.0990371Z triton_mm_616 0.0085 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:10.0991327Z triton_mm_621 0.0085 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:10.0992281Z triton_mm_620 0.0087 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:10.0993237Z triton_mm_624 0.0087 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.0994294Z triton_mm_630 0.0089 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:10.0995774Z triton_mm_615 0.0089 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:10.0996766Z triton_mm_623 0.0089 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:10.0997741Z triton_mm_629 0.0092 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.0998588Z SingleProcess AUTOTUNE benchmarking takes 0.2197 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T09:53:10.3418062Z Autotune Choices Stats: 2025-09-07T09:53:10.3419681Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_856", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.007679999805986881, "best_triton_pos": 0} 2025-09-07T09:53:10.5174841Z AUTOTUNE mm(1568x104, 104x624) 2025-09-07T09:53:10.5175347Z strides: [104, 1], [1, 104] 2025-09-07T09:53:10.5175629Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:10.5176643Z triton_mm_856 0.0077 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.5177660Z triton_mm_855 0.0077 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:10.5178686Z triton_mm_857 0.0079 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:10.5179677Z triton_mm_859 0.0079 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:10.5180635Z triton_mm_854 0.0080 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.5181710Z triton_mm_858 0.0080 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.5182325Z mm 0.0082 ms 93.4% 2025-09-07T09:53:10.5182911Z triton_mm_852 0.0083 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:10.5183904Z triton_mm_853 0.0083 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:10.5185119Z triton_mm_861 0.0084 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.5185917Z SingleProcess AUTOTUNE benchmarking takes 0.3943 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:53:10.7408710Z Autotune Choices Stats: 2025-09-07T09:53:10.7409717Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_879", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.006399999838322401, "best_triton_pos": 0} 2025-09-07T09:53:10.9595962Z AUTOTUNE addmm(8x624, 8x52, 52x624) 2025-09-07T09:53:10.9596704Z strides: [0, 1], [52, 1], [1, 52] 2025-09-07T09:53:10.9597098Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:10.9597811Z triton_mm_879 0.0064 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:10.9598800Z triton_mm_880 0.0064 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:10.9599805Z triton_mm_884 0.0064 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.9600769Z triton_mm_890 0.0065 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.9601732Z triton_mm_891 0.0065 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:10.9603353Z triton_mm_881 0.0066 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:10.9604316Z triton_mm_885 0.0066 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:10.9605957Z triton_mm_893 0.0066 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:10.9606938Z triton_mm_888 0.0067 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:10.9607916Z triton_mm_889 0.0067 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:10.9608763Z SingleProcess AUTOTUNE benchmarking takes 0.4413 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:53:11.1817242Z Autotune Choices Stats: 2025-09-07T09:53:11.1818311Z 
{"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_898", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00886400043964386, "best_triton_pos": 0} 2025-09-07T09:53:11.3818149Z AUTOTUNE mm(1568x624, 624x160) 2025-09-07T09:53:11.3818463Z strides: [624, 1], [1, 624] 2025-09-07T09:53:11.3818748Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:11.3819436Z triton_mm_898 0.0089 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:11.3820471Z triton_mm_902 0.0092 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:11.3821093Z mm 0.0095 ms 93.3% 2025-09-07T09:53:11.3821764Z triton_mm_897 0.0102 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:11.3822736Z triton_mm_901 0.0106 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:11.3823696Z triton_mm_895 0.0110 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:11.3824812Z triton_mm_906 0.0110 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:11.3826206Z triton_mm_908 0.0115 ms 77.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=8 2025-09-07T09:53:11.3827189Z triton_mm_905 0.0115 ms 77.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:11.3828155Z triton_mm_904 0.0119 ms 74.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:11.3828995Z SingleProcess AUTOTUNE benchmarking takes 0.4215 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:53:11.6153275Z Autotune Choices Stats: 2025-09-07T09:53:11.6155458Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_971", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.006783999968320131, "best_triton_pos": 0} 2025-09-07T09:53:11.8015231Z AUTOTUNE addmm(8x480, 8x80, 80x480) 2025-09-07T09:53:11.8015584Z strides: [0, 1], [80, 1], [1, 80] 2025-09-07T09:53:11.8015899Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:11.8017108Z triton_mm_971 0.0068 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:11.8018119Z triton_mm_982 0.0068 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:11.8019103Z triton_mm_975 0.0068 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:11.8020109Z triton_mm_970 0.0069 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4 2025-09-07T09:53:11.8021068Z triton_mm_981 0.0070 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:11.8022141Z triton_mm_974 0.0072 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:11.8023097Z triton_mm_978 0.0072 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:11.8024060Z triton_mm_977 0.0072 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:11.8025272Z triton_mm_984 0.0072 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:11.8026168Z triton_mm_979 0.0073 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:11.8026964Z SingleProcess AUTOTUNE benchmarking takes 0.4146 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:53:12.0469979Z Autotune Choices Stats: 2025-09-07T09:53:12.0471036Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_1253", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.008576000109314919, "best_triton_pos": 0} 2025-09-07T09:53:12.2391870Z AUTOTUNE mm(1568x160, 160x960) 2025-09-07T09:53:12.2392207Z strides: [160, 1], [1, 160] 2025-09-07T09:53:12.2392485Z dtypes: torch.float16, torch.float16 
2025-09-07T09:53:12.2393175Z triton_mm_1253 0.0086 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:12.2393832Z mm 0.0088 ms 97.5% 2025-09-07T09:53:12.2394495Z triton_mm_1257 0.0089 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:12.2395946Z triton_mm_1252 0.0090 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.2396909Z triton_mm_1256 0.0090 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.2398411Z triton_mm_1259 0.0090 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.2399379Z triton_mm_1254 0.0090 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.2400602Z triton_mm_1261 0.0090 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:12.2401592Z triton_mm_1260 0.0091 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.2402565Z triton_mm_1250 0.0094 ms 91.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:12.2403404Z SingleProcess AUTOTUNE 
benchmarking takes 0.4106 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:53:12.4687589Z Autotune Choices Stats: 2025-09-07T09:53:12.4688659Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_1281", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0066559999249875546, "best_triton_pos": 0} 2025-09-07T09:53:12.6318458Z AUTOTUNE addmm(8x960, 8x80, 80x960) 2025-09-07T09:53:12.6318761Z strides: [0, 1], [80, 1], [1, 80] 2025-09-07T09:53:12.6319037Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:12.6319675Z triton_mm_1281 0.0067 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:12.6320637Z triton_mm_1282 0.0067 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:12.6321574Z triton_mm_1293 0.0068 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:12.6322498Z triton_mm_1292 0.0069 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.6323394Z triton_mm_1286 0.0069 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.6324300Z triton_mm_1295 0.0070 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:12.6325656Z triton_mm_1285 0.0071 ms 
94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:12.6326557Z triton_mm_1288 0.0071 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.6327467Z triton_mm_1289 0.0071 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:12.6328369Z triton_mm_1290 0.0072 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:12.6329664Z SingleProcess AUTOTUNE benchmarking takes 0.3919 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:53:12.8558589Z Autotune Choices Stats: 2025-09-07T09:53:12.8559622Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_1300", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.009184000082314014, "best_triton_pos": 0} 2025-09-07T09:53:13.0360676Z AUTOTUNE mm(392x960, 960x264) 2025-09-07T09:53:13.0360987Z strides: [960, 1], [1, 960] 2025-09-07T09:53:13.0361843Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:13.0362600Z triton_mm_1300 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:13.0363252Z mm 0.0093 ms 98.6% 2025-09-07T09:53:13.0363844Z triton_mm_1304 0.0094 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:13.0364882Z 
triton_mm_1308 0.0105 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:13.0367729Z triton_mm_1303 0.0112 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:13.0368582Z triton_mm_1299 0.0115 ms 79.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:13.0369406Z triton_mm_1298 0.0117 ms 78.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:13.0370239Z triton_mm_1314 0.0120 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:13.0371085Z triton_mm_1307 0.0120 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:13.0371914Z triton_mm_1297 0.0129 ms 71.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:13.0372656Z SingleProcess AUTOTUNE benchmarking takes 0.4036 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:53:13.2551031Z Autotune Choices Stats: 2025-09-07T09:53:13.2552277Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.008960000239312649, "best_triton_pos": 1, "best_triton_time": 0.009119999594986439, "best_triton_kernel": "triton_mm_1325", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8"} 2025-09-07T09:53:13.4739143Z AUTOTUNE mm(392x264, 264x1584) 2025-09-07T09:53:13.4739469Z strides: [264, 1], [1, 264] 2025-09-07T09:53:13.4739738Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:13.4740022Z mm 0.0090 ms 100.0% 2025-09-07T09:53:13.4740655Z triton_mm_1325 0.0091 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:13.4741811Z triton_mm_1326 0.0092 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:13.4742808Z triton_mm_1329 0.0093 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:13.4744273Z triton_mm_1324 0.0097 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:13.4745604Z triton_mm_1322 0.0098 ms 91.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:13.4746851Z triton_mm_1327 0.0099 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:13.4747827Z triton_mm_1328 0.0101 ms 88.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:13.4748796Z triton_mm_1332 0.0101 ms 88.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:13.4749779Z 
triton_mm_1333 0.0102 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:13.4750628Z SingleProcess AUTOTUNE benchmarking takes 0.4373 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:53:14.0862810Z Autotune Choices Stats: 2025-09-07T09:53:14.0864374Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_1354", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.007360000163316727, "best_triton_pos": 0} 2025-09-07T09:53:14.1388378Z AUTOTUNE addmm(8x1584, 8x132, 132x1584) 2025-09-07T09:53:14.1388674Z strides: [0, 1], [132, 1], [1, 132] 2025-09-07T09:53:14.1388982Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:14.1389693Z triton_mm_1354 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:14.1390726Z triton_mm_1352 0.0075 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:14.1391715Z triton_mm_1355 0.0075 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:14.1392689Z triton_mm_1353 0.0076 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:14.1393646Z triton_mm_1365 0.0076 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 
2025-09-07T09:53:14.1394600Z triton_mm_1364 0.0076 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:14.1395707Z triton_mm_1358 0.0076 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:14.1396668Z triton_mm_1359 0.0077 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:14.1397631Z triton_mm_1362 0.0079 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:14.1398590Z triton_mm_1361 0.0081 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:14.1399781Z SingleProcess AUTOTUNE benchmarking takes 0.6629 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:53:14.3613189Z Autotune Choices Stats: 2025-09-07T09:53:14.3614844Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_1372", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00940799992531538, "best_triton_pos": 0} 2025-09-07T09:53:14.5933994Z AUTOTUNE mm(392x792, 792x132) 2025-09-07T09:53:14.5934325Z strides: [792, 1], [1, 792] 2025-09-07T09:53:14.5934603Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:14.5935699Z triton_mm_1372 0.0094 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:14.5936578Z 
triton_mm_1376 0.0103 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:14.5937422Z triton_mm_1371 0.0105 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:14.5938256Z triton_mm_1375 0.0107 ms 88.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:14.5939087Z triton_mm_1370 0.0109 ms 86.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:14.5939922Z triton_mm_1380 0.0118 ms 79.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:14.5940763Z triton_mm_1369 0.0120 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:14.5941294Z mm 0.0122 ms 77.2% 2025-09-07T09:53:14.5941891Z triton_mm_1379 0.0124 ms 76.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:14.5942741Z triton_mm_1378 0.0128 ms 73.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:14.5943477Z SingleProcess AUTOTUNE benchmarking takes 0.4537 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:53:14.8364837Z Autotune Choices Stats: 2025-09-07T09:53:14.8366487Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 
0.008799999952316284, "best_triton_pos": 1, "best_triton_time": 0.009279999881982803, "best_triton_kernel": "triton_mm_1598", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8"} 2025-09-07T09:53:15.0408832Z AUTOTUNE mm(392x264, 264x1536) 2025-09-07T09:53:15.0409140Z strides: [264, 1], [1, 264] 2025-09-07T09:53:15.0409415Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:15.0409699Z mm 0.0088 ms 100.0% 2025-09-07T09:53:15.0410351Z triton_mm_1598 0.0093 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:15.0411367Z triton_mm_1599 0.0093 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:15.0412347Z triton_mm_1602 0.0094 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:15.0413680Z triton_mm_1600 0.0095 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:15.0414645Z triton_mm_1597 0.0099 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:15.0416167Z triton_mm_1601 0.0099 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:15.0417020Z triton_mm_1606 0.0100 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=8 2025-09-07T09:53:15.0417865Z triton_mm_1595 0.0100 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:15.0418711Z triton_mm_1605 0.0104 ms 84.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:15.0419450Z SingleProcess AUTOTUNE benchmarking takes 0.4241 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:53:15.2757145Z Autotune Choices Stats: 2025-09-07T09:53:15.2758186Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_1611", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.009568000212311745, "best_triton_pos": 0} 2025-09-07T09:53:15.2834259Z AUTOTUNE addmm(8x1000, 8x1536, 1536x1000) 2025-09-07T09:53:15.2834623Z strides: [0, 1], [1536, 1], [1, 1536] 2025-09-07T09:53:15.2835234Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:53:15.2836066Z triton_mm_1611 0.0096 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:15.2836813Z bias_addmm 0.0105 ms 91.2% 2025-09-07T09:53:15.2837425Z triton_mm_1615 0.0105 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:15.2838400Z triton_mm_1619 0.0120 ms 79.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:15.2839386Z triton_mm_1623 0.0133 ms 72.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:15.2840009Z addmm 0.0138 ms 69.5% 2025-09-07T09:53:15.2840582Z triton_mm_1610 0.0144 ms 66.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:15.2841540Z triton_mm_1609 0.0152 ms 62.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:15.2842491Z triton_mm_1614 0.0155 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:15.2843435Z triton_mm_1608 0.0159 ms 60.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:53:15.2844301Z SingleProcess AUTOTUNE benchmarking takes 0.2408 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:53:48.2014016Z Autotune Choices Stats: 2025-09-07T09:53:48.2015547Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_1648", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0066559999249875546, "best_triton_pos": 0} 2025-09-07T09:53:48.3510615Z AUTOTUNE mm(1000x8, 8x1536) 2025-09-07T09:53:48.3510953Z strides: [1, 1000], [1536, 1] 2025-09-07T09:53:48.3511254Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:48.3512473Z triton_mm_1648 0.0067 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:48.3513562Z triton_mm_1646 0.0067 ms 99.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:53:48.3514586Z triton_mm_1647 0.0068 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:53:48.3516093Z triton_mm_1652 0.0068 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:48.3517024Z triton_mm_1651 0.0068 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:48.3517920Z triton_mm_1653 0.0068 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:48.3518827Z triton_mm_1649 0.0068 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:48.3519741Z triton_mm_1650 0.0070 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:53:48.3520651Z triton_mm_1654 0.0070 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:53:48.3521563Z triton_mm_1655 0.0070 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:48.3522365Z SingleProcess AUTOTUNE benchmarking takes 0.3070 seconds and 0.0003 seconds precompiling for 17 choices 2025-09-07T09:53:49.2570029Z Autotune Choices 
Stats: 2025-09-07T09:53:49.2571366Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 0.009247999638319016, "best_triton_pos": 1, "best_triton_time": 0.009920000098645687, "best_triton_kernel": "triton_mm_1632", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:53:49.3297088Z AUTOTUNE mm(8x1000, 1000x1536) 2025-09-07T09:53:49.3297385Z strides: [1000, 1], [1536, 1] 2025-09-07T09:53:49.3297658Z dtypes: torch.float16, torch.float16 2025-09-07T09:53:49.3297945Z mm 0.0092 ms 100.0% 2025-09-07T09:53:49.3298598Z triton_mm_1632 0.0099 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:49.3299599Z triton_mm_1628 0.0101 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:49.3300590Z triton_mm_1636 0.0103 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:49.3302019Z triton_mm_1640 0.0116 ms 79.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:53:49.3302984Z triton_mm_1626 0.0116 ms 79.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:53:49.3304245Z triton_mm_1627 0.0118 ms 78.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:53:49.3305691Z triton_mm_1631 0.0125 ms 73.9% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:49.3306690Z triton_mm_1638 0.0131 ms 70.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:53:49.3307665Z triton_mm_1635 0.0132 ms 70.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:53:49.3308513Z SingleProcess AUTOTUNE benchmarking takes 0.4459 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:54:05.0038020Z W0907 09:54:05.002000 95042 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:54:53.2305344Z pass 2025-09-07T09:55:01.5404090Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:55:01.5405917Z import pynvml # type: ignore[import] 2025-09-07T09:55:04.5667442Z 2025-09-07T09:55:06.3374574Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:55:06.3374882Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:55:06.3375440Z cuda train tinynet_a 2025-09-07T09:55:43.0880624Z Autotune Choices Stats: 2025-09-07T09:55:43.0882169Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "convolution", "best_time": 0.015263999812304974, "best_triton_pos": 1, "best_triton_time": 0.018912000581622124, "best_triton_kernel": "triton_convolution2d_4", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T09:55:43.1319489Z AUTOTUNE convolution(8x3x192x192, 32x3x3x3) 2025-09-07T09:55:43.1319842Z strides: [110592, 1, 576, 3], [27, 1, 9, 3] 2025-09-07T09:55:43.1320142Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:43.1320445Z convolution 0.0153 ms 100.0% 2025-09-07T09:55:43.1321216Z triton_convolution2d_4 0.0189 ms 80.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:55:43.1322439Z triton_convolution2d_0 0.0217 ms 70.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:55:43.1323665Z triton_convolution2d_3 0.0254 ms 60.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:55:43.1324873Z triton_convolution2d_2 0.0279 ms 54.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, 
UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:55:43.1326860Z triton_convolution2d_5 0.0355 ms 43.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:55:43.1328145Z triton_convolution2d_1 0.0412 ms 37.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:55:43.1329325Z SingleProcess AUTOTUNE benchmarking takes 0.1392 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T09:55:43.6891862Z Autotune Choices Stats: 2025-09-07T09:55:43.6892908Z {"num_choices": 13, "num_triton_choices": 12, "best_kernel": "triton_mm_23", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.00854399986565113, "best_triton_pos": 0} 2025-09-07T09:55:43.7148395Z AUTOTUNE mm(73728x32, 32x16) 2025-09-07T09:55:43.7148684Z strides: [32, 1], [1, 32] 2025-09-07T09:55:43.7148928Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:43.7149574Z triton_mm_23 0.0085 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:43.7150530Z triton_mm_19 0.0086 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:55:43.7151492Z triton_mm_20 0.0086 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:43.7152435Z triton_mm_27 0.0087 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:55:43.7153426Z triton_mm_26 0.0088 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:43.7154377Z triton_mm_28 0.0088 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:43.7155514Z triton_mm_18 0.0088 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T09:55:43.7156467Z triton_mm_22 0.0088 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:43.7157401Z triton_mm_24 0.0089 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:43.7158354Z triton_mm_25 0.0090 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:43.7159147Z SingleProcess AUTOTUNE benchmarking takes 0.5791 seconds and 0.0002 seconds precompiling for 13 choices 2025-09-07T09:55:43.9144540Z Autotune Choices Stats: 2025-09-07T09:55:43.9145811Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_39", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.011680000461637974, "best_triton_pos": 0} 2025-09-07T09:55:43.9349857Z AUTOTUNE mm(73728x16, 16x96) 2025-09-07T09:55:43.9350139Z strides: [16, 1], [1, 16] 2025-09-07T09:55:43.9350383Z dtypes: 
torch.float16, torch.float16 2025-09-07T09:55:43.9351046Z triton_mm_39 0.0117 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:43.9352624Z triton_mm_37 0.0117 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:43.9353600Z triton_mm_29 0.0118 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:55:43.9354815Z triton_mm_36 0.0118 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:43.9355994Z triton_mm_34 0.0119 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:43.9356955Z triton_mm_43 0.0120 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:43.9357919Z triton_mm_33 0.0124 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:43.9358856Z triton_mm_40 0.0124 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:43.9359738Z triton_mm_38 0.0126 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:43.9360616Z triton_mm_41 0.0126 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:43.9361401Z SingleProcess AUTOTUNE benchmarking takes 0.2195 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T09:55:44.3793099Z Autotune Choices Stats: 2025-09-07T09:55:44.3794146Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_76", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.008671999908983707, "best_triton_pos": 0} 2025-09-07T09:55:44.6621696Z AUTOTUNE mm(18432x96, 96x24) 2025-09-07T09:55:44.6621971Z strides: [96, 1], [1, 96] 2025-09-07T09:55:44.6622222Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:44.6622831Z triton_mm_76 0.0087 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:44.6623731Z triton_mm_77 0.0087 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:44.6624632Z triton_mm_78 0.0089 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:44.6625839Z triton_mm_74 0.0089 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:44.6626718Z triton_mm_81 0.0090 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:44.6627635Z triton_mm_83 0.0090 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:44.6628609Z triton_mm_75 0.0091 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:44.6636465Z triton_mm_80 0.0091 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:44.6637416Z triton_mm_71 0.0092 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:44.6638612Z triton_mm_84 0.0093 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:44.6639437Z SingleProcess AUTOTUNE benchmarking takes 0.7242 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:55:44.8731065Z Autotune Choices Stats: 2025-09-07T09:55:44.8732635Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_91", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.0081599997356534, "best_triton_pos": 0} 2025-09-07T09:55:44.9228513Z AUTOTUNE mm(18432x24, 24x144) 2025-09-07T09:55:44.9228909Z strides: [24, 1], [1, 24] 2025-09-07T09:55:44.9229149Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:44.9229796Z triton_mm_91 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:44.9230792Z triton_mm_93 0.0083 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4 2025-09-07T09:55:44.9231767Z triton_mm_98 0.0085 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:44.9232744Z triton_mm_94 0.0086 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:44.9233692Z triton_mm_96 0.0086 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:44.9234642Z triton_mm_97 0.0087 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:44.9239030Z triton_mm_89 0.0092 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:44.9239852Z triton_mm_92 0.0094 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:44.9240674Z triton_mm_86 0.0094 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:44.9241489Z triton_mm_95 0.0094 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:44.9242215Z SingleProcess AUTOTUNE benchmarking takes 0.2600 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:55:45.5157001Z Autotune Choices Stats: 2025-09-07T09:55:45.5158233Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_135", "best_kernel_desc": 
"ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.009824000298976898, "best_triton_pos": 0} 2025-09-07T09:55:45.6353731Z AUTOTUNE mm(18432x144, 144x24) 2025-09-07T09:55:45.8028436Z strides: [144, 1], [1, 144] 2025-09-07T09:55:45.8028971Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:45.8029698Z triton_mm_135 0.0098 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:45.8030786Z triton_mm_137 0.0100 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:45.8032111Z triton_mm_132 0.0100 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:45.8033142Z triton_mm_133 0.0101 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:45.8034214Z triton_mm_141 0.0101 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:45.8035664Z triton_mm_138 0.0102 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:45.8036705Z triton_mm_140 0.0103 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:45.8037771Z triton_mm_128 0.0105 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:45.8038443Z mm 0.0105 ms 93.6% 2025-09-07T09:55:45.8039015Z triton_mm_134 0.0105 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:45.8039855Z SingleProcess AUTOTUNE benchmarking takes 0.7095 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:55:45.8630668Z Autotune Choices Stats: 2025-09-07T09:55:45.8631796Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_193", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.007648000027984381, "best_triton_pos": 0} 2025-09-07T09:55:46.0546663Z AUTOTUNE mm(4608x144, 144x40) 2025-09-07T09:55:46.0547125Z strides: [144, 1], [1, 144] 2025-09-07T09:55:46.0547573Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:46.0548787Z triton_mm_193 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:46.0549765Z triton_mm_189 0.0078 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:46.0550718Z triton_mm_191 0.0078 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:46.0551667Z triton_mm_192 0.0080 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:46.0552622Z triton_mm_196 0.0081 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, 
BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:46.0553590Z triton_mm_194 0.0081 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:46.2268719Z mm 0.0082 ms 93.4% 2025-09-07T09:55:46.2269429Z triton_mm_190 0.0082 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:46.2270398Z triton_mm_198 0.0082 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:46.2271559Z triton_mm_199 0.0082 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:46.2272412Z SingleProcess AUTOTUNE benchmarking takes 0.4137 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:55:46.4574058Z Autotune Choices Stats: 2025-09-07T09:55:46.4575984Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_207", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.007455999962985516, "best_triton_pos": 0} 2025-09-07T09:55:46.5331561Z AUTOTUNE mm(4608x40, 40x240) 2025-09-07T09:55:46.5331983Z strides: [40, 1], [1, 40] 2025-09-07T09:55:46.5332379Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:46.5333427Z triton_mm_207 0.0075 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:46.5335253Z triton_mm_208 0.0075 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:46.5336801Z triton_mm_204 0.0076 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:46.5338434Z triton_mm_212 0.0076 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:46.5339799Z triton_mm_211 0.0077 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:46.5340844Z triton_mm_217 0.0079 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:46.5341991Z triton_mm_218 0.0080 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:46.5343041Z triton_mm_206 0.0080 ms 92.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:46.5344086Z triton_mm_201 0.0081 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:46.5345248Z triton_mm_213 0.0081 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:46.5346166Z SingleProcess AUTOTUNE benchmarking takes 0.4777 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:55:47.0252273Z Autotune Choices Stats: 2025-09-07T09:55:47.0253439Z 
{"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_253", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.008320000022649765, "best_triton_pos": 0} 2025-09-07T09:55:47.0358100Z AUTOTUNE mm(4608x240, 240x40) 2025-09-07T09:55:47.0358825Z strides: [240, 1], [1, 240] 2025-09-07T09:55:47.0359422Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:47.0360621Z triton_mm_253 0.0083 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:47.0362432Z triton_mm_254 0.0085 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:47.0364513Z triton_mm_250 0.0085 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:47.0366627Z triton_mm_249 0.0086 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:47.0368513Z triton_mm_259 0.0088 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:47.0369524Z mm 0.0089 ms 93.5% 2025-09-07T09:55:47.0370185Z triton_mm_252 0.0089 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:47.0371347Z triton_mm_258 0.0089 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4 2025-09-07T09:55:47.0372510Z triton_mm_243 0.0091 ms 91.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:47.0373664Z triton_mm_245 0.0092 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:47.0374682Z SingleProcess AUTOTUNE benchmarking takes 0.4997 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:55:47.2803739Z Autotune Choices Stats: 2025-09-07T09:55:47.2805797Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_306", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007391999941319227, "best_triton_pos": 0} 2025-09-07T09:55:47.5888576Z AUTOTUNE mm(1152x240, 240x80) 2025-09-07T09:55:47.5889158Z strides: [240, 1], [1, 240] 2025-09-07T09:55:47.5889660Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:47.5890704Z triton_mm_306 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:47.5891591Z triton_mm_310 0.0075 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:47.5892440Z triton_mm_309 0.0076 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:47.5892958Z mm 0.0077 ms 96.3% 2025-09-07T09:55:47.5893438Z triton_mm_304 0.0077 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=8 2025-09-07T09:55:47.5894290Z triton_mm_303 0.0078 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:47.5895488Z triton_mm_305 0.0078 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:47.5896554Z triton_mm_313 0.0080 ms 92.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:47.5897394Z triton_mm_314 0.0082 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:47.5898220Z triton_mm_312 0.0084 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:47.5899095Z SingleProcess AUTOTUNE benchmarking takes 0.5472 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:55:48.0510566Z Autotune Choices Stats: 2025-09-07T09:55:48.0511658Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_332", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.007391999941319227, "best_triton_pos": 0} 2025-09-07T09:55:48.5342302Z AUTOTUNE mm(1152x80, 80x480) 2025-09-07T09:55:48.5342631Z strides: [80, 1], [1, 80] 2025-09-07T09:55:48.5342894Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:48.5343575Z triton_mm_332 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T09:55:48.5344620Z triton_mm_335 0.0076 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:48.5346045Z triton_mm_334 0.0077 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:48.5347032Z triton_mm_330 0.0077 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:48.5348018Z triton_mm_331 0.0077 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:48.5349109Z triton_mm_328 0.0078 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:48.5349810Z mm 0.0078 ms 94.3% 2025-09-07T09:55:48.5350422Z triton_mm_327 0.0080 ms 92.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:48.5351479Z triton_mm_333 0.0081 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:48.5352538Z triton_mm_324 0.0081 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:48.5353455Z SingleProcess AUTOTUNE benchmarking takes 0.9444 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:55:49.4863355Z Autotune Choices Stats: 2025-09-07T09:55:49.4864392Z {"num_choices": 20, "num_triton_choices": 19, 
"best_kernel": "triton_mm_368", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0081599997356534, "best_triton_pos": 0} 2025-09-07T09:55:49.7698441Z AUTOTUNE mm(1152x480, 480x80) 2025-09-07T09:55:49.7698741Z strides: [480, 1], [1, 480] 2025-09-07T09:55:49.7699006Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:49.7699701Z triton_mm_368 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:49.7700751Z mm 0.0084 ms 97.7% 2025-09-07T09:55:49.7701425Z triton_mm_372 0.0085 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:49.7702401Z triton_mm_367 0.0091 ms 90.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:49.7703617Z triton_mm_366 0.0092 ms 89.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:49.7704593Z triton_mm_376 0.0092 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:49.7705926Z triton_mm_371 0.0093 ms 87.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:49.7706897Z triton_mm_365 0.0096 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:49.7707868Z 
triton_mm_375 0.0098 ms 83.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:49.7708829Z triton_mm_374 0.0100 ms 81.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:49.7709812Z SingleProcess AUTOTUNE benchmarking takes 1.2324 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:55:50.2404914Z Autotune Choices Stats: 2025-09-07T09:55:50.2406727Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_554", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008320000022649765, "best_triton_pos": 0} 2025-09-07T09:55:50.4926756Z AUTOTUNE mm(1152x480, 480x112) 2025-09-07T09:55:50.4927103Z strides: [480, 1], [1, 480] 2025-09-07T09:55:50.4927387Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:50.4928132Z triton_mm_554 0.0083 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:50.4929352Z triton_mm_558 0.0085 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:50.4930062Z mm 0.0086 ms 96.7% 2025-09-07T09:55:50.4930669Z triton_mm_553 0.0094 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:50.4931720Z triton_mm_552 0.0094 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T09:55:50.4932756Z triton_mm_557 0.0094 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:50.4933820Z triton_mm_562 0.0095 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:50.4934881Z triton_mm_551 0.0100 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:50.4936866Z triton_mm_560 0.0101 ms 82.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:50.4937909Z triton_mm_561 0.0101 ms 82.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:50.4938820Z SingleProcess AUTOTUNE benchmarking takes 0.7025 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:55:51.0536799Z Autotune Choices Stats: 2025-09-07T09:55:51.0538419Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_580", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.007424000184983015, "best_triton_pos": 0} 2025-09-07T09:55:51.0677651Z AUTOTUNE mm(1152x112, 112x672) 2025-09-07T09:55:51.0677974Z strides: [112, 1], [1, 112] 2025-09-07T09:55:51.0678305Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:51.0679085Z triton_mm_580 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:51.0680412Z 
triton_mm_581 0.0078 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:51.0681750Z triton_mm_579 0.0079 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:51.0683058Z triton_mm_582 0.0079 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:51.0684361Z triton_mm_575 0.0081 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:51.0686119Z triton_mm_578 0.0082 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:51.0687430Z triton_mm_587 0.0084 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:51.0688751Z triton_mm_576 0.0085 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:51.0689978Z triton_mm_586 0.0085 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:51.0691019Z triton_mm_572 0.0086 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:51.0691933Z SingleProcess AUTOTUNE benchmarking takes 0.5742 seconds and 0.0005 seconds precompiling for 20 choices 
2025-09-07T09:55:51.3523363Z Autotune Choices Stats: 2025-09-07T09:55:51.3524870Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_616", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008960000239312649, "best_triton_pos": 0} 2025-09-07T09:55:51.4151981Z AUTOTUNE mm(1152x672, 672x112) 2025-09-07T09:55:51.4152327Z strides: [672, 1], [1, 672] 2025-09-07T09:55:51.4152637Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:51.4153416Z triton_mm_616 0.0090 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:51.4154605Z mm 0.0091 ms 98.9% 2025-09-07T09:55:51.4155454Z triton_mm_620 0.0092 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:51.4156609Z triton_mm_615 0.0103 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:51.4157988Z triton_mm_624 0.0104 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:51.4159141Z triton_mm_614 0.0105 ms 85.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:51.4160279Z triton_mm_619 0.0105 ms 85.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:51.4161325Z triton_mm_613 0.0114 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, 
BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:51.4162372Z triton_mm_623 0.0114 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:51.4163416Z triton_mm_622 0.0117 ms 76.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:51.4164338Z SingleProcess AUTOTUNE benchmarking takes 0.3443 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:55:51.7110093Z Autotune Choices Stats: 2025-09-07T09:55:51.7111373Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_802", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008287999778985977, "best_triton_pos": 0} 2025-09-07T09:55:52.1177296Z AUTOTUNE mm(288x672, 672x192) 2025-09-07T09:55:52.1177759Z strides: [672, 1], [1, 672] 2025-09-07T09:55:52.1178185Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:52.1179310Z triton_mm_802 0.0083 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:52.1180429Z mm 0.0084 ms 98.1% 2025-09-07T09:55:52.1181062Z triton_mm_806 0.0085 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:52.1182129Z triton_mm_801 0.0096 ms 86.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:52.1183112Z triton_mm_800 0.0099 ms 84.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:52.1184082Z triton_mm_805 0.0099 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:52.1185717Z triton_mm_810 0.0102 ms 81.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:52.1186719Z triton_mm_809 0.0108 ms 76.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:52.1187695Z triton_mm_799 0.0109 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:52.1189019Z triton_mm_808 0.0111 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:52.1189943Z SingleProcess AUTOTUNE benchmarking takes 0.6813 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:55:52.3608006Z Autotune Choices Stats: 2025-09-07T09:55:52.3610003Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_824", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.0074880002066493034, "best_triton_pos": 0} 2025-09-07T09:55:52.6304866Z AUTOTUNE mm(288x192, 192x1152) 2025-09-07T09:55:52.6305888Z strides: [192, 1], [1, 192] 2025-09-07T09:55:52.6306341Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:52.6307418Z triton_mm_824 0.0075 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:52.6309060Z triton_mm_825 0.0077 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:52.6310856Z triton_mm_828 0.0079 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:52.6311882Z triton_mm_820 0.0080 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:52.6312829Z triton_mm_827 0.0080 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:52.6313830Z triton_mm_829 0.0081 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:52.6314808Z triton_mm_831 0.0081 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:52.6315908Z triton_mm_830 0.0083 ms 90.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:52.6316511Z mm 0.0084 ms 89.7% 2025-09-07T09:55:52.6317057Z triton_mm_819 0.0084 ms 89.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:52.6317885Z SingleProcess AUTOTUNE benchmarking takes 0.5120 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T09:55:52.8687295Z Autotune Choices Stats: 
2025-09-07T09:55:52.8688571Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_870", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008799999952316284, "best_triton_pos": 0} 2025-09-07T09:55:53.1936668Z AUTOTUNE mm(288x1152, 1152x192) 2025-09-07T09:55:53.1937129Z strides: [1152, 1], [1, 1152] 2025-09-07T09:55:53.1937583Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:53.1938672Z triton_mm_870 0.0088 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:53.1939707Z mm 0.0090 ms 97.5% 2025-09-07T09:55:53.1940858Z triton_mm_874 0.0095 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:53.1942737Z triton_mm_878 0.0103 ms 85.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:53.1943627Z triton_mm_869 0.0121 ms 72.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:53.1944767Z triton_mm_873 0.0125 ms 70.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:53.1945800Z triton_mm_868 0.0126 ms 69.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:53.1946696Z triton_mm_884 0.0129 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:53.1947591Z triton_mm_877 0.0130 ms 67.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:53.1948466Z triton_mm_867 0.0131 ms 67.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:53.1949240Z SingleProcess AUTOTUNE benchmarking takes 0.5597 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:55:53.4926637Z Autotune Choices Stats: 2025-09-07T09:55:53.4928169Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_1142", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008960000239312649, "best_triton_pos": 0} 2025-09-07T09:55:53.5130392Z AUTOTUNE mm(288x1152, 1152x320) 2025-09-07T09:55:53.5130671Z strides: [1152, 1], [1, 1152] 2025-09-07T09:55:53.5130935Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:53.5131605Z triton_mm_1142 0.0090 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:53.5132235Z mm 0.0092 ms 97.6% 2025-09-07T09:55:53.5132831Z triton_mm_1146 0.0096 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:53.5133821Z triton_mm_1150 0.0107 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:53.5134806Z triton_mm_1141 0.0123 ms 73.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:53.5136170Z triton_mm_1156 0.0126 ms 71.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:53.5137147Z triton_mm_1145 0.0128 ms 70.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:53.5138115Z triton_mm_1140 0.0129 ms 69.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:53.5139073Z triton_mm_1139 0.0133 ms 67.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:53.5140066Z triton_mm_1149 0.0133 ms 67.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:53.5141219Z SingleProcess AUTOTUNE benchmarking takes 0.2912 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:55:53.9866777Z Autotune Choices Stats: 2025-09-07T09:55:53.9868272Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_1165", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008063999935984612, "best_triton_pos": 0} 2025-09-07T09:55:54.0009788Z AUTOTUNE mm(288x320, 320x1280) 2025-09-07T09:55:54.0010070Z strides: [320, 1], [1, 320] 2025-09-07T09:55:54.0010320Z dtypes: torch.float16, torch.float16 2025-09-07T09:55:54.0010974Z triton_mm_1165 0.0081 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:55:54.0011643Z mm 0.0083 ms 97.3% 2025-09-07T09:55:54.0012227Z triton_mm_1164 0.0085 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:55:54.0013215Z triton_mm_1169 0.0088 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:55:54.0014204Z triton_mm_1168 0.0090 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:55:54.0018502Z triton_mm_1160 0.0091 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:54.0019500Z triton_mm_1167 0.0092 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:54.0020494Z triton_mm_1158 0.0092 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:55:54.0021442Z triton_mm_1171 0.0093 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:55:54.0022263Z triton_mm_1175 0.0095 ms 84.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:55:54.0022995Z SingleProcess AUTOTUNE benchmarking takes 0.4873 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:56:27.2323793Z pass 2025-09-07T09:56:32.6479615Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T09:56:32.6480864Z import pynvml # type: ignore[import] 2025-09-07T09:56:35.6418181Z 2025-09-07T09:56:36.7202716Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:56:36.7203123Z loading model: 0it [00:01, ?it/s] 2025-09-07T09:56:36.7203486Z cuda train tnt_s_patch16_224 2025-09-07T09:57:13.5295582Z Autotune Choices Stats: 2025-09-07T09:57:13.5296991Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.011744000017642975, "best_triton_pos": 1, "best_triton_time": 0.012671999633312225, "best_triton_kernel": "triton_mm_239", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:57:13.5545705Z AUTOTUNE addmm(1576x1536, 1576x384, 384x1536) 2025-09-07T09:57:13.5545981Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T09:57:13.5546257Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:57:13.5546535Z bias_addmm 0.0117 ms 100.0% 2025-09-07T09:57:13.5547105Z triton_mm_239 0.0127 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:13.5548317Z triton_mm_245 0.0131 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:13.5549239Z triton_mm_237 0.0135 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:13.5550139Z triton_mm_238 0.0137 
ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:13.5551051Z triton_mm_241 0.0137 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:13.5551951Z triton_mm_244 0.0140 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:13.5552862Z triton_mm_242 0.0141 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:13.5553841Z triton_mm_235 0.0150 ms 78.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:13.5554800Z triton_mm_234 0.0159 ms 73.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:13.5555822Z SingleProcess AUTOTUNE benchmarking takes 0.2919 seconds and 0.0004 seconds precompiling for 21 choices 2025-09-07T09:57:14.2881401Z Autotune Choices Stats: 2025-09-07T09:57:14.2882472Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_94", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.008320000022649765, "best_triton_pos": 0} 2025-09-07T09:57:14.3144293Z AUTOTUNE addmm(25088x96, 25088x24, 24x96) 2025-09-07T09:57:14.3144617Z strides: [0, 1], [24, 1], [1, 24] 2025-09-07T09:57:14.3144927Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:57:14.3145881Z triton_mm_94 0.0083 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:14.3146925Z triton_mm_90 0.0085 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:14.3147899Z triton_mm_91 0.0085 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:14.3148878Z triton_mm_85 0.0086 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:14.3149840Z triton_mm_92 0.0086 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:14.3150791Z triton_mm_89 0.0087 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:14.3152066Z triton_mm_93 0.0087 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:14.3153038Z triton_mm_87 0.0087 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:14.3154326Z triton_mm_96 0.0089 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:14.3155443Z triton_mm_95 0.0092 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:57:14.3156284Z SingleProcess AUTOTUNE benchmarking takes 0.2657 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:57:14.8652423Z Autotune Choices Stats: 2025-09-07T09:57:14.8653555Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_0", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.02672000043094158, "best_triton_pos": 0} 2025-09-07T09:57:14.8762903Z AUTOTUNE convolution(8x3x224x224, 24x3x7x7) 2025-09-07T09:57:14.8763215Z strides: [150528, 50176, 224, 1], [147, 49, 7, 1] 2025-09-07T09:57:14.8763631Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:14.8765406Z triton_convolution2d_0 0.0267 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:57:14.8767256Z triton_convolution2d_4 0.0276 ms 96.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:57:14.8769069Z triton_convolution2d_3 0.0278 ms 96.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:57:14.8770869Z triton_convolution2d_5 0.0294 ms 90.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T09:57:14.8772667Z triton_convolution2d_1 0.0319 ms 83.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, 
STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T09:57:14.8773846Z convolution 0.0451 ms 59.3% 2025-09-07T09:57:14.8774512Z triton_convolution2d_2 0.0836 ms 32.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T09:57:14.8775535Z SingleProcess AUTOTUNE benchmarking takes 0.1060 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T09:57:16.3033786Z Autotune Choices Stats: 2025-09-07T09:57:16.3035610Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.009247999638319016, "best_triton_pos": 1, "best_triton_time": 0.010208000428974628, "best_triton_kernel": "triton_mm_13", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T09:57:16.3524023Z AUTOTUNE addmm(1568x384, 1568x384, 384x384) 2025-09-07T09:57:16.3524430Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T09:57:16.3524816Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:57:16.3525835Z bias_addmm 0.0092 ms 100.0% 2025-09-07T09:57:16.3526468Z triton_mm_13 0.0102 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:16.3527505Z triton_mm_18 0.0103 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:16.3528778Z triton_mm_17 0.0106 ms 87.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:16.3529752Z triton_mm_24 0.0112 ms 82.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:16.3530727Z triton_mm_16 0.0112 ms 82.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:16.3531683Z triton_mm_20 0.0113 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:16.3532651Z triton_mm_7 0.0113 ms 81.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:16.3533619Z triton_mm_14 0.0115 ms 80.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:16.3534653Z triton_mm_23 0.0117 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:16.3535608Z SingleProcess AUTOTUNE benchmarking takes 0.3144 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T09:57:16.5523443Z Autotune Choices Stats: 2025-09-07T09:57:16.5524616Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_34", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.007615999784320593, "best_triton_pos": 0} 2025-09-07T09:57:16.5608024Z AUTOTUNE mm(25088x24, 24x48) 2025-09-07T09:57:16.5608337Z strides: [24, 1], [1, 24] 2025-09-07T09:57:16.5608600Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:16.5609315Z triton_mm_34 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:16.5610392Z triton_mm_40 0.0079 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:16.5611439Z triton_mm_36 0.0079 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:16.5612461Z triton_mm_37 0.0079 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:16.5613482Z triton_mm_35 0.0079 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:16.5614532Z triton_mm_33 0.0080 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:16.5615548Z triton_mm_31 0.0081 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:16.5616843Z triton_mm_28 0.0081 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:16.5617732Z triton_mm_32 0.0081 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:16.5618791Z triton_mm_38 0.0082 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:16.5619575Z SingleProcess AUTOTUNE benchmarking takes 
0.2073 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T09:57:16.7464912Z Autotune Choices Stats: 2025-09-07T09:57:16.7466241Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_mm_54", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8", "best_time": 0.00723200011998415, "best_triton_pos": 0} 2025-09-07T09:57:16.7663508Z AUTOTUNE mm(25088x24, 24x24) 2025-09-07T09:57:16.7663770Z strides: [24, 1], [1, 24] 2025-09-07T09:57:16.7664023Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:16.7664696Z triton_mm_54 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:57:16.7665851Z triton_mm_55 0.0074 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:16.7666811Z triton_mm_41 0.0074 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:57:16.7667764Z triton_mm_44 0.0074 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:16.7668731Z triton_mm_45 0.0074 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:16.7669674Z triton_mm_49 0.0074 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:16.7670665Z triton_mm_43 0.0075 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:16.7671648Z triton_mm_51 0.0076 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:16.7672608Z triton_mm_47 0.0076 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:16.7673579Z triton_mm_53 0.0076 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:16.7674620Z SingleProcess AUTOTUNE benchmarking takes 0.2050 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T09:57:16.8415114Z Autotune Choices Stats: 2025-09-07T09:57:16.8416223Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_bmm_60", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1", "best_time": 0.009664000011980534, "best_triton_pos": 0} 2025-09-07T09:57:16.9201955Z AUTOTUNE bmm(6272x16x6, 6272x6x16) 2025-09-07T09:57:16.9202246Z strides: [96, 6, 1], [96, 16, 1] 2025-09-07T09:57:16.9202971Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:16.9203619Z triton_bmm_60 0.0097 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:57:16.9204824Z triton_bmm_56 0.0098 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1 2025-09-07T09:57:16.9206702Z triton_bmm_57 0.0098 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:57:16.9207672Z triton_bmm_58 0.0098 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:57:16.9208619Z triton_bmm_59 0.0098 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:57:16.9209230Z bmm 0.0118 ms 81.8% 2025-09-07T09:57:16.9209680Z SingleProcess AUTOTUNE benchmarking takes 0.1533 seconds and 0.0002 seconds precompiling for 6 choices 2025-09-07T09:57:16.9950051Z Autotune Choices Stats: 2025-09-07T09:57:16.9951083Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_bmm_61", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1", "best_time": 0.009824000298976898, "best_triton_pos": 0} 2025-09-07T09:57:17.0165205Z AUTOTUNE bmm(6272x16x16, 6272x16x6) 2025-09-07T09:57:17.0165511Z strides: [256, 16, 1], [96, 6, 1] 2025-09-07T09:57:17.0165776Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:17.0166443Z triton_bmm_61 0.0098 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1 2025-09-07T09:57:17.0167439Z triton_bmm_63 0.0099 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:57:17.0168407Z triton_bmm_65 0.0099 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:57:17.0169381Z triton_bmm_64 0.0100 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:57:17.0170343Z triton_bmm_62 0.0100 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:57:17.0170950Z bmm 0.0121 ms 81.0% 2025-09-07T09:57:17.0171397Z SingleProcess AUTOTUNE benchmarking takes 0.0958 seconds and 0.0002 seconds precompiling for 6 choices 2025-09-07T09:57:17.2509078Z Autotune Choices Stats: 2025-09-07T09:57:17.2510426Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009151999838650227, "best_triton_pos": 1, "best_triton_time": 0.009472000412642956, "best_triton_kernel": "triton_mm_122", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T09:57:17.3081093Z AUTOTUNE mm(1568x384, 384x384) 2025-09-07T09:57:17.3081362Z strides: [384, 1], [1, 384] 2025-09-07T09:57:17.3081632Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:17.3081893Z mm 0.0092 ms 100.0% 2025-09-07T09:57:17.3082472Z triton_mm_122 0.0095 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:17.3083449Z triton_mm_126 0.0095 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:17.3085560Z triton_mm_127 0.0097 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:17.3086559Z triton_mm_129 0.0102 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=4, num_warps=8 2025-09-07T09:57:17.3087802Z triton_mm_125 0.0103 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:17.3088768Z triton_mm_132 0.0104 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:17.3089743Z triton_mm_133 0.0105 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:17.3090698Z triton_mm_123 0.0105 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:17.3091650Z triton_mm_116 0.0105 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:17.3092521Z SingleProcess AUTOTUNE benchmarking takes 0.2860 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:57:17.7014652Z Autotune Choices Stats: 2025-09-07T09:57:17.7016175Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009952000342309475, "best_triton_pos": 1, "best_triton_time": 0.010463999584317207, "best_triton_kernel": "triton_mm_145", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:57:17.7437247Z AUTOTUNE mm(1576x384, 384x768) 2025-09-07T09:57:17.7437536Z strides: [384, 1], [1, 384] 2025-09-07T09:57:17.7437787Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:17.7438053Z mm 0.0100 ms 100.0% 2025-09-07T09:57:17.7438649Z triton_mm_145 0.0105 ms 95.1% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:17.7439637Z triton_mm_151 0.0106 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:17.7440594Z triton_mm_141 0.0106 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:17.7441570Z triton_mm_152 0.0106 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:17.7442523Z triton_mm_144 0.0110 ms 90.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:17.7443477Z triton_mm_148 0.0112 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:17.7444445Z triton_mm_143 0.0119 ms 83.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:17.7445530Z triton_mm_147 0.0120 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:17.7446783Z triton_mm_150 0.0123 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:17.7447573Z SingleProcess AUTOTUNE benchmarking takes 0.4344 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:57:17.9740776Z Autotune Choices Stats: 
2025-09-07T09:57:17.9742580Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009344000369310379, "best_triton_pos": 1, "best_triton_time": 0.009535999968647957, "best_triton_kernel": "triton_mm_164", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:57:18.0507345Z AUTOTUNE mm(1576x384, 384x384) 2025-09-07T09:57:18.0507646Z strides: [384, 1], [1, 384] 2025-09-07T09:57:18.0507915Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:18.0508222Z mm 0.0093 ms 100.0% 2025-09-07T09:57:18.0508842Z triton_mm_164 0.0095 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.0509869Z triton_mm_165 0.0096 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:18.0510856Z triton_mm_160 0.0096 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:18.0511805Z triton_mm_163 0.0100 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:18.0512750Z triton_mm_154 0.0104 ms 89.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:18.0513720Z triton_mm_167 0.0104 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:18.0514858Z triton_mm_171 0.0104 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:18.0516213Z triton_mm_170 0.0105 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.0517163Z triton_mm_161 0.0107 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:18.0517999Z SingleProcess AUTOTUNE benchmarking takes 0.3058 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:57:18.2813610Z Autotune Choices Stats: 2025-09-07T09:57:18.2814643Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_bmm_183", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.010751999914646149, "best_triton_pos": 0} 2025-09-07T09:57:18.3588352Z AUTOTUNE bmm(48x197x64, 48x64x197) 2025-09-07T09:57:18.3588682Z strides: [12608, 64, 1], [12608, 197, 1] 2025-09-07T09:57:18.3589004Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:18.3589681Z triton_bmm_183 0.0108 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.3590719Z triton_bmm_184 0.0108 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:18.3592065Z triton_bmm_189 0.0117 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.3593045Z triton_bmm_188 0.0123 ms 87.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.3594223Z triton_bmm_176 0.0124 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:18.3595510Z triton_bmm_185 0.0125 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.3596475Z triton_bmm_180 0.0127 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:18.3597444Z triton_bmm_173 0.0130 ms 83.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:18.3598420Z triton_bmm_186 0.0131 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:18.3599381Z triton_bmm_179 0.0136 ms 79.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:18.3600214Z SingleProcess AUTOTUNE benchmarking takes 0.3070 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:57:18.5773992Z Autotune Choices Stats: 2025-09-07T09:57:18.5775482Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_202", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.011807999573647976, "best_triton_pos": 0} 2025-09-07T09:57:18.6070456Z AUTOTUNE bmm(48x197x197, 48x197x64) 2025-09-07T09:57:18.6070780Z strides: [38848, 197, 1], 
[12608, 64, 1] 2025-09-07T09:57:18.6071063Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:18.6071753Z triton_bmm_202 0.0118 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.6072748Z triton_bmm_198 0.0123 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:18.6073723Z triton_bmm_199 0.0124 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:18.6074817Z triton_bmm_203 0.0125 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:18.6075968Z triton_bmm_193 0.0127 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:18.6076886Z triton_bmm_200 0.0129 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.6077780Z triton_bmm_197 0.0130 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:18.6078690Z triton_bmm_207 0.0132 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.6080101Z triton_bmm_205 0.0133 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T09:57:18.6080999Z triton_bmm_201 0.0135 ms 87.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:18.6081997Z SingleProcess AUTOTUNE benchmarking takes 0.2477 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:57:18.8680157Z Autotune Choices Stats: 2025-09-07T09:57:18.8681421Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.009568000212311745, "best_triton_pos": 1, "best_triton_time": 0.009727999567985535, "best_triton_kernel": "triton_mm_216", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T09:57:18.9287340Z AUTOTUNE addmm(1576x384, 1576x384, 384x384) 2025-09-07T09:57:18.9287699Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T09:57:18.9288012Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:57:18.9288339Z bias_addmm 0.0096 ms 100.0% 2025-09-07T09:57:18.9288966Z triton_mm_216 0.0097 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:18.9289960Z triton_mm_221 0.0099 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:18.9290930Z triton_mm_220 0.0106 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.9291890Z triton_mm_223 0.0109 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:18.9292857Z triton_mm_210 0.0110 ms 86.9% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:18.9293819Z triton_mm_227 0.0111 ms 86.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:18.9294871Z triton_mm_219 0.0114 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:18.9296098Z triton_mm_217 0.0114 ms 83.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:18.9297003Z triton_mm_226 0.0115 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:18.9297790Z SingleProcess AUTOTUNE benchmarking takes 0.3211 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T09:57:19.1682134Z Autotune Choices Stats: 2025-09-07T09:57:19.1683408Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.012000000104308128, "best_triton_pos": 1, "best_triton_time": 0.013407999649643898, "best_triton_kernel": "triton_mm_259", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:57:19.2350469Z AUTOTUNE mm(1576x1536, 1536x384) 2025-09-07T09:57:19.2350904Z strides: [1536, 1], [1, 1536] 2025-09-07T09:57:19.2351317Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:19.2352281Z mm 0.0120 ms 100.0% 2025-09-07T09:57:19.2353169Z triton_mm_259 0.0134 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:19.2354734Z triton_mm_265 0.0157 ms 76.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:19.2356393Z triton_mm_255 0.0159 ms 75.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:19.2357381Z triton_mm_254 0.0164 ms 73.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:19.2358339Z triton_mm_258 0.0174 ms 69.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:19.2359300Z triton_mm_251 0.0185 ms 65.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:19.2360255Z triton_mm_261 0.0194 ms 62.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:19.2361221Z triton_mm_264 0.0195 ms 61.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:19.2362186Z triton_mm_257 0.0196 ms 61.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:19.2363016Z SingleProcess AUTOTUNE benchmarking takes 0.3047 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:57:19.8895271Z Autotune Choices Stats: 2025-09-07T09:57:19.8896945Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": 
"triton_mm_2921", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.007360000163316727, "best_triton_pos": 0} 2025-09-07T09:57:19.9236204Z AUTOTUNE addmm(8x1000, 8x384, 384x1000) 2025-09-07T09:57:19.9246648Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T09:57:19.9247026Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T09:57:19.9247704Z triton_mm_2921 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:57:19.9248640Z triton_mm_2925 0.0078 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:19.9249234Z bias_addmm 0.0084 ms 87.5% 2025-09-07T09:57:19.9249782Z triton_mm_2920 0.0084 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:57:19.9250660Z triton_mm_2919 0.0086 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:19.9251551Z triton_mm_2924 0.0087 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:19.9252447Z triton_mm_2933 0.0088 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:19.9253348Z triton_mm_2918 0.0089 ms 82.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 
2025-09-07T09:57:19.9254518Z triton_mm_2929 0.0092 ms 80.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:19.9255573Z triton_mm_2931 0.0094 ms 78.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:19.9256435Z SingleProcess AUTOTUNE benchmarking takes 0.2770 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:57:55.7067679Z Autotune Choices Stats: 2025-09-07T09:57:55.7068913Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.011231999844312668, "best_triton_pos": 1, "best_triton_time": 0.011615999974310398, "best_triton_kernel": "triton_mm_2978", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T09:57:55.7448830Z AUTOTUNE mm(1576x384, 384x1536) 2025-09-07T09:57:55.7449166Z strides: [384, 1], [1536, 1] 2025-09-07T09:57:55.7449445Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:55.7449718Z mm 0.0112 ms 100.0% 2025-09-07T09:57:55.7450366Z triton_mm_2978 0.0116 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:55.7451449Z triton_mm_2984 0.0120 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:55.7452465Z triton_mm_2977 0.0120 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:55.7453491Z triton_mm_2976 0.0123 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:55.7454496Z triton_mm_2981 0.0124 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:55.7455821Z triton_mm_2980 0.0126 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:55.7456797Z triton_mm_2983 0.0127 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:55.7457772Z triton_mm_2974 0.0138 ms 81.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:55.7458742Z triton_mm_2985 0.0140 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:55.7459604Z SingleProcess AUTOTUNE benchmarking takes 0.2442 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:57:56.5410470Z Autotune Choices Stats: 2025-09-07T09:57:56.5411588Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_3271", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008224000222980976, "best_triton_pos": 0} 2025-09-07T09:57:56.8131319Z AUTOTUNE mm(25088x24, 24x96) 2025-09-07T09:57:56.8131640Z strides: [24, 1], [96, 1] 2025-09-07T09:57:56.8131903Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:56.8132606Z triton_mm_3271 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:56.8134079Z triton_mm_3275 0.0083 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:56.8135430Z triton_mm_3277 0.0083 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:56.8136629Z triton_mm_3276 0.0083 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:56.8137533Z triton_mm_3278 0.0083 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:56.8138430Z triton_mm_3273 0.0083 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:56.8139329Z triton_mm_3280 0.0083 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:56.8140236Z triton_mm_3281 0.0085 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:57:56.8141148Z triton_mm_3282 0.0085 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:56.8142142Z triton_mm_3283 0.0085 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T09:57:56.8142929Z SingleProcess AUTOTUNE benchmarking takes 0.5226 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T09:57:58.6522152Z Autotune Choices Stats: 2025-09-07T09:57:58.6523531Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.012128000147640705, "best_triton_pos": 1, "best_triton_time": 0.01398400031030178, "best_triton_kernel": "triton_mm_2998", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:57:58.6648594Z AUTOTUNE mm(384x1576, 1576x1536) 2025-09-07T09:57:58.6648908Z strides: [1, 384], [1536, 1] 2025-09-07T09:57:58.6649175Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:58.6649446Z mm 0.0121 ms 100.0% 2025-09-07T09:57:58.6650088Z triton_mm_2998 0.0140 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:58.6651147Z triton_mm_2994 0.0175 ms 69.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:58.6652169Z triton_mm_3004 0.0176 ms 68.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:58.6653176Z triton_mm_2997 0.0180 ms 67.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:58.6654190Z triton_mm_2993 0.0181 ms 67.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:58.6655386Z triton_mm_2996 0.0193 ms 62.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:58.6656761Z triton_mm_3000 0.0197 ms 61.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:58.6657635Z triton_mm_3003 0.0197 ms 61.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:58.6658706Z triton_mm_2990 0.0217 ms 56.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:58.6659457Z SingleProcess AUTOTUNE benchmarking takes 1.2585 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:57:59.0908810Z Autotune Choices Stats: 2025-09-07T09:57:59.0910123Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.012128000147640705, "best_triton_pos": 1, "best_triton_time": 0.013824000023305416, "best_triton_kernel": "triton_mm_3036", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:57:59.1622958Z AUTOTUNE mm(1536x1576, 1576x384) 2025-09-07T09:57:59.1623261Z strides: [1, 1536], [384, 1] 2025-09-07T09:57:59.1623522Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:59.1623810Z mm 0.0121 ms 100.0% 2025-09-07T09:57:59.1624471Z triton_mm_3036 0.0138 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:57:59.1625904Z triton_mm_3032 0.0172 ms 70.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4 2025-09-07T09:57:59.1627042Z triton_mm_3031 0.0177 ms 68.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:59.1628045Z triton_mm_3042 0.0177 ms 68.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:59.1629028Z triton_mm_3035 0.0182 ms 66.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:59.1630012Z triton_mm_3034 0.0189 ms 64.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:59.1631000Z triton_mm_3041 0.0197 ms 61.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:59.1631987Z triton_mm_3038 0.0198 ms 61.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:59.1632980Z triton_mm_3028 0.0215 ms 56.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:59.1633832Z SingleProcess AUTOTUNE benchmarking takes 0.3005 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:57:59.6278413Z Autotune Choices Stats: 2025-09-07T09:57:59.6279513Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_2956", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.006207999773323536, 
"best_triton_pos": 0} 2025-09-07T09:57:59.6497488Z AUTOTUNE mm(1000x8, 8x384) 2025-09-07T09:57:59.6497761Z strides: [1, 1000], [384, 1] 2025-09-07T09:57:59.6498482Z dtypes: torch.float16, torch.float16 2025-09-07T09:57:59.6499154Z triton_mm_2956 0.0062 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:59.6500247Z triton_mm_2951 0.0062 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:57:59.6501569Z triton_mm_2955 0.0062 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:59.6502582Z triton_mm_2952 0.0063 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:57:59.6503572Z triton_mm_2953 0.0063 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:59.6504568Z triton_mm_2957 0.0063 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:57:59.6505943Z triton_mm_2958 0.0063 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:57:59.6507095Z triton_mm_2954 0.0064 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:57:59.6508077Z triton_mm_2959 0.0065 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:57:59.6509059Z triton_mm_2960 0.0065 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:57:59.6509923Z SingleProcess AUTOTUNE benchmarking takes 0.1801 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T09:58:00.1403274Z Autotune Choices Stats: 2025-09-07T09:58:00.1404632Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01027199998497963, "best_triton_pos": 1, "best_triton_time": 0.011392000131309032, "best_triton_kernel": "triton_mm_3199", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:58:00.2177071Z AUTOTUNE mm(768x1576, 1576x384) 2025-09-07T09:58:00.2177392Z strides: [1, 768], [384, 1] 2025-09-07T09:58:00.2177653Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:00.2177930Z mm 0.0103 ms 100.0% 2025-09-07T09:58:00.2178583Z triton_mm_3199 0.0114 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:00.2179657Z triton_mm_3203 0.0132 ms 77.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:00.2180711Z triton_mm_3195 0.0158 ms 65.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:00.2181811Z triton_mm_3198 0.0163 ms 63.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 
2025-09-07T09:58:00.2182803Z triton_mm_3202 0.0168 ms 61.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:00.2184240Z triton_mm_3209 0.0171 ms 60.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:00.2185417Z triton_mm_3194 0.0180 ms 57.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:00.2186492Z triton_mm_3201 0.0181 ms 56.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:00.2187675Z triton_mm_3205 0.0188 ms 54.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:00.2188544Z SingleProcess AUTOTUNE benchmarking takes 0.2988 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T09:58:00.7728684Z Autotune Choices Stats: 2025-09-07T09:58:00.7730040Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009759999811649323, "best_triton_pos": 1, "best_triton_time": 0.010847999714314938, "best_triton_kernel": "triton_mm_3161", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:58:00.8073686Z AUTOTUNE mm(384x1576, 1576x384) 2025-09-07T09:58:00.8073948Z strides: [1, 384], [384, 1] 2025-09-07T09:58:00.8074194Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:00.8074471Z mm 0.0098 ms 100.0% 2025-09-07T09:58:00.8075229Z triton_mm_3161 0.0108 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:00.8076220Z triton_mm_3157 0.0111 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:00.8077467Z triton_mm_3165 0.0130 ms 75.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:00.8078477Z triton_mm_3156 0.0152 ms 64.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:00.8079457Z triton_mm_3155 0.0155 ms 63.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:00.8080439Z triton_mm_3171 0.0160 ms 61.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:00.8081419Z triton_mm_3160 0.0160 ms 61.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:00.8082402Z triton_mm_3164 0.0164 ms 59.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:00.8083379Z triton_mm_3163 0.0183 ms 53.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:00.8084240Z SingleProcess AUTOTUNE benchmarking takes 0.2537 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:58:01.3558620Z Autotune Choices Stats: 
2025-09-07T09:58:01.3559966Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.0098879998549819, "best_triton_pos": 1, "best_triton_time": 0.010847999714314938, "best_triton_kernel": "triton_mm_3256", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:58:01.3884279Z AUTOTUNE mm(384x1568, 1568x384) 2025-09-07T09:58:01.3884695Z strides: [1, 384], [384, 1] 2025-09-07T09:58:01.3885127Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:01.3885430Z mm 0.0099 ms 100.0% 2025-09-07T09:58:01.3886068Z triton_mm_3256 0.0108 ms 91.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:01.3887502Z triton_mm_3252 0.0110 ms 90.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:01.3888527Z triton_mm_3260 0.0129 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:01.3889501Z triton_mm_3251 0.0143 ms 69.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:01.3890475Z triton_mm_3250 0.0153 ms 64.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:01.3891464Z triton_mm_3255 0.0160 ms 61.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:01.3892449Z triton_mm_3266 0.0162 ms 61.2% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:01.3893427Z triton_mm_3259 0.0165 ms 59.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:01.3894399Z triton_mm_3258 0.0170 ms 58.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:01.3895417Z SingleProcess AUTOTUNE benchmarking takes 0.2492 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:58:03.9312595Z Autotune Choices Stats: 2025-09-07T09:58:03.9314054Z {"num_choices": 28, "num_triton_choices": 17, "best_kernel": "decompose_k_mm_49_split_0", "best_kernel_desc": "k_split=49", "best_time": 0.013439999893307686, "best_triton_pos": 11, "best_triton_time": 0.10764800012111664, "best_triton_kernel": "triton_mm_3288", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:58:03.9829210Z AUTOTUNE mm(24x25088, 25088x96) 2025-09-07T09:58:03.9829502Z strides: [1, 24], [96, 1] 2025-09-07T09:58:03.9829753Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:03.9830098Z decompose_k_mm_49_split_0 0.0134 ms 100.0% k_split=49 2025-09-07T09:58:03.9830450Z decompose_k_mm_98_split_1 0.0142 ms 94.6% k_split=98 2025-09-07T09:58:03.9830782Z decompose_k_mm_16_split_8 0.0151 ms 89.2% k_split=16 2025-09-07T09:58:03.9831073Z mm 0.0151 ms 88.8% 2025-09-07T09:58:03.9831321Z decompose_k_mm_28_split_9 0.0153 ms 88.1% k_split=28 2025-09-07T09:58:03.9831665Z decompose_k_mm_14_split_7 0.0153 ms 87.7% k_split=14 2025-09-07T09:58:03.9831987Z decompose_k_mm_7_split_5 0.0157 ms 85.7% k_split=7 2025-09-07T09:58:03.9832321Z decompose_k_mm_8_split_6 0.0164 ms 81.7% 
k_split=8 2025-09-07T09:58:03.9832661Z decompose_k_mm_196_split_2 0.0182 ms 73.8% k_split=196 2025-09-07T09:58:03.9832995Z decompose_k_mm_4_split_4 0.0190 ms 70.8% k_split=4 2025-09-07T09:58:03.9833518Z SingleProcess AUTOTUNE benchmarking takes 2.2206 seconds and 0.0002 seconds precompiling for 28 choices 2025-09-07T09:58:05.1744000Z Autotune Choices Stats: 2025-09-07T09:58:05.1746736Z {"num_choices": 28, "num_triton_choices": 17, "best_kernel": "decompose_k_mm_49_split_10", "best_kernel_desc": "k_split=49", "best_time": 0.013055999763309956, "best_triton_pos": 11, "best_triton_time": 0.07705599814653397, "best_triton_kernel": "triton_mm_3329", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:58:05.2136984Z AUTOTUNE mm(96x25088, 25088x24) 2025-09-07T09:58:05.2137273Z strides: [1, 96], [24, 1] 2025-09-07T09:58:05.2137850Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:05.2138229Z decompose_k_mm_49_split_10 0.0131 ms 100.0% k_split=49 2025-09-07T09:58:05.2138597Z decompose_k_mm_98_split_11 0.0137 ms 95.3% k_split=98 2025-09-07T09:58:05.2138900Z mm 0.0150 ms 86.8% 2025-09-07T09:58:05.2139157Z decompose_k_mm_16_split_18 0.0150 ms 86.8% k_split=16 2025-09-07T09:58:05.2139496Z decompose_k_mm_28_split_19 0.0151 ms 86.6% k_split=28 2025-09-07T09:58:05.2139851Z decompose_k_mm_14_split_17 0.0151 ms 86.3% k_split=14 2025-09-07T09:58:05.2140188Z decompose_k_mm_7_split_15 0.0154 ms 84.8% k_split=7 2025-09-07T09:58:05.2140547Z decompose_k_mm_8_split_16 0.0162 ms 80.5% k_split=8 2025-09-07T09:58:05.2140906Z decompose_k_mm_4_split_14 0.0188 ms 69.3% k_split=4 2025-09-07T09:58:05.2141281Z decompose_k_mm_196_split_12 0.0209 ms 62.4% k_split=196 2025-09-07T09:58:05.2141924Z SingleProcess AUTOTUNE benchmarking takes 1.0453 seconds and 0.0002 seconds precompiling for 28 choices 2025-09-07T09:58:07.4970251Z Autotune Choices Stats: 
2025-09-07T09:58:07.4971782Z {"num_choices": 25, "num_triton_choices": 14, "best_kernel": "decompose_k_mm_98_split_41", "best_kernel_desc": "k_split=98", "best_time": 0.012160000391304493, "best_triton_pos": 11, "best_triton_time": 0.0830719992518425, "best_triton_kernel": "triton_mm_3418", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:58:07.5374739Z AUTOTUNE mm(48x25088, 25088x24) 2025-09-07T09:58:07.5375840Z strides: [1, 48], [24, 1] 2025-09-07T09:58:07.5376103Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:07.5376444Z decompose_k_mm_98_split_41 0.0122 ms 100.0% k_split=98 2025-09-07T09:58:07.5376809Z decompose_k_mm_49_split_40 0.0124 ms 97.7% k_split=49 2025-09-07T09:58:07.5377174Z decompose_k_mm_28_split_49 0.0132 ms 91.8% k_split=28 2025-09-07T09:58:07.5377496Z mm 0.0138 ms 88.0% 2025-09-07T09:58:07.5377782Z decompose_k_mm_16_split_48 0.0140 ms 87.0% k_split=16 2025-09-07T09:58:07.5378131Z decompose_k_mm_14_split_47 0.0142 ms 85.4% k_split=14 2025-09-07T09:58:07.5378490Z decompose_k_mm_8_split_46 0.0172 ms 70.5% k_split=8 2025-09-07T09:58:07.5378888Z decompose_k_mm_196_split_42 0.0174 ms 69.9% k_split=196 2025-09-07T09:58:07.5379186Z decompose_k_mm_7_split_45 0.0177 ms 68.7% k_split=7 2025-09-07T09:58:07.5379469Z decompose_k_mm_4_split_44 0.0235 ms 51.7% k_split=4 2025-09-07T09:58:07.5379932Z SingleProcess AUTOTUNE benchmarking takes 2.1293 seconds and 0.0002 seconds precompiling for 25 choices 2025-09-07T09:58:09.5267914Z Autotune Choices Stats: 2025-09-07T09:58:09.5269610Z {"num_choices": 22, "num_triton_choices": 11, "best_kernel": "decompose_k_mm_98_split_31", "best_kernel_desc": "k_split=98", "best_time": 0.011680000461637974, "best_triton_pos": 11, "best_triton_time": 0.11635199934244156, "best_triton_kernel": "triton_mm_3391", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T09:58:09.5376652Z AUTOTUNE mm(24x25088, 25088x24) 2025-09-07T09:58:09.5376926Z strides: [1, 24], [24, 1] 2025-09-07T09:58:09.5377186Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:09.5377522Z decompose_k_mm_98_split_31 0.0117 ms 100.0% k_split=98 2025-09-07T09:58:09.5377881Z decompose_k_mm_49_split_30 0.0118 ms 99.2% k_split=49 2025-09-07T09:58:09.5378690Z decompose_k_mm_28_split_39 0.0137 ms 85.1% k_split=28 2025-09-07T09:58:09.5379045Z decompose_k_mm_196_split_32 0.0148 ms 78.8% k_split=196 2025-09-07T09:58:09.5379359Z mm 0.0159 ms 73.4% 2025-09-07T09:58:09.5379621Z decompose_k_mm_16_split_38 0.0163 ms 71.6% k_split=16 2025-09-07T09:58:09.5379962Z decompose_k_mm_14_split_37 0.0171 ms 68.4% k_split=14 2025-09-07T09:58:09.5380309Z decompose_k_mm_8_split_36 0.0217 ms 53.8% k_split=8 2025-09-07T09:58:09.5380648Z decompose_k_mm_7_split_35 0.0233 ms 50.1% k_split=7 2025-09-07T09:58:09.5381227Z decompose_k_mm_4_split_34 0.0335 ms 34.9% k_split=4 2025-09-07T09:58:09.5381865Z SingleProcess AUTOTUNE benchmarking takes 1.7461 seconds and 0.0002 seconds precompiling for 22 choices 2025-09-07T09:58:10.7608365Z Autotune Choices Stats: 2025-09-07T09:58:10.7609677Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 0.00848000030964613, "best_triton_pos": 1, "best_triton_time": 0.008832000195980072, "best_triton_kernel": "triton_mm_2938", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2"} 2025-09-07T09:58:10.8507952Z AUTOTUNE mm(8x1000, 1000x384) 2025-09-07T09:58:10.8508232Z strides: [1000, 1], [384, 1] 2025-09-07T09:58:10.8508514Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:10.8508780Z mm 0.0085 ms 100.0% 2025-09-07T09:58:10.8509498Z triton_mm_2938 0.0088 ms 96.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:58:10.8510587Z triton_mm_2942 0.0093 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:10.8511575Z triton_mm_2946 0.0097 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:10.8512585Z triton_mm_2937 0.0108 ms 78.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T09:58:10.8513557Z triton_mm_2936 0.0112 ms 75.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:10.8514533Z triton_mm_2950 0.0114 ms 74.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:10.8515887Z triton_mm_2941 0.0118 ms 71.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:10.8516856Z triton_mm_2948 0.0127 ms 66.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:10.8517844Z triton_mm_2945 0.0129 ms 65.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:10.8518696Z SingleProcess AUTOTUNE benchmarking takes 0.2725 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:58:11.0699201Z Autotune Choices Stats: 
2025-09-07T09:58:11.0700454Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01152000017464161, "best_triton_pos": 1, "best_triton_time": 0.012959999963641167, "best_triton_kernel": "triton_mm_3017", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:58:11.1160048Z AUTOTUNE mm(1576x1536, 1536x384) 2025-09-07T09:58:11.1160692Z strides: [1536, 1], [384, 1] 2025-09-07T09:58:11.1160968Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:11.1161240Z mm 0.0115 ms 100.0% 2025-09-07T09:58:11.1161857Z triton_mm_3017 0.0130 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:11.1162870Z triton_mm_3023 0.0154 ms 74.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:11.1164092Z triton_mm_3013 0.0159 ms 72.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:11.1165430Z triton_mm_3012 0.0164 ms 70.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:11.1166454Z triton_mm_3016 0.0168 ms 68.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:11.1167445Z triton_mm_3022 0.0186 ms 62.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:11.1168430Z triton_mm_3015 0.0186 ms 61.9% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:11.1169487Z triton_mm_3019 0.0188 ms 61.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:11.1170560Z triton_mm_3009 0.0193 ms 59.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:11.1171404Z SingleProcess AUTOTUNE benchmarking takes 0.2629 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:58:11.3068866Z Autotune Choices Stats: 2025-09-07T09:58:11.3070319Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.008832000195980072, "best_triton_pos": 1, "best_triton_time": 0.009440000168979168, "best_triton_kernel": "triton_mm_3050", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T09:58:11.3918527Z AUTOTUNE mm(1576x384, 384x384) 2025-09-07T09:58:11.3918775Z strides: [384, 1], [384, 1] 2025-09-07T09:58:11.3919018Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:11.3919259Z mm 0.0088 ms 100.0% 2025-09-07T09:58:11.3919927Z triton_mm_3050 0.0094 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:11.3921046Z triton_mm_3055 0.0094 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:11.3922074Z triton_mm_3054 0.0097 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4 2025-09-07T09:58:11.3923081Z triton_mm_3053 0.0100 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:11.3924091Z triton_mm_3060 0.0100 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:11.3925321Z triton_mm_3061 0.0100 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:11.3926676Z triton_mm_3057 0.0101 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:11.3927677Z triton_mm_3051 0.0105 ms 84.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:11.3928861Z triton_mm_3044 0.0107 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:11.3929789Z SingleProcess AUTOTUNE benchmarking takes 0.2745 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:58:11.6087113Z Autotune Choices Stats: 2025-09-07T09:58:11.6088050Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_3093", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T09:58:11.9728467Z AUTOTUNE bmm(48x197x197, 48x197x64) 2025-09-07T09:58:11.9728803Z strides: [38848, 1, 197], [12608, 64, 1] 2025-09-07T09:58:11.9729107Z dtypes: torch.float16, 
torch.float16 2025-09-07T09:58:11.9729818Z triton_bmm_3093 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:11.9730888Z triton_bmm_3089 0.0114 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:11.9731894Z triton_bmm_3092 0.0120 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:11.9732907Z triton_bmm_3088 0.0122 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:11.9733909Z triton_bmm_3098 0.0128 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:11.9734928Z triton_bmm_3083 0.0132 ms 85.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:11.9736615Z triton_bmm_3087 0.0132 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:11.9737609Z triton_bmm_3090 0.0133 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:11.9738616Z triton_bmm_3097 0.0134 ms 83.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:11.9739621Z triton_bmm_3091 0.0135 ms 83.6% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:11.9740410Z SingleProcess AUTOTUNE benchmarking takes 0.5794 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T09:58:12.1675784Z Autotune Choices Stats: 2025-09-07T09:58:12.1676762Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_bmm_3111", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.010432000271975994, "best_triton_pos": 0} 2025-09-07T09:58:12.2132575Z AUTOTUNE bmm(48x197x64, 48x64x197) 2025-09-07T09:58:12.2133034Z strides: [12608, 64, 1], [12608, 1, 64] 2025-09-07T09:58:12.2133482Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:12.2134539Z triton_bmm_3111 0.0104 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:12.2136639Z triton_bmm_3110 0.0105 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.2138651Z triton_bmm_3108 0.0107 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.2140301Z triton_bmm_3116 0.0111 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.2141167Z triton_bmm_3115 0.0111 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.2142108Z triton_bmm_3107 0.0115 ms 90.6% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:12.2142954Z triton_bmm_3112 0.0117 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.2143795Z triton_bmm_3105 0.0117 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:12.2144637Z triton_bmm_3106 0.0117 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:12.2145619Z triton_bmm_3100 0.0117 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:12.2146356Z SingleProcess AUTOTUNE benchmarking takes 0.2360 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:58:12.4030169Z Autotune Choices Stats: 2025-09-07T09:58:12.4031291Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_3137", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.013120000250637531, "best_triton_pos": 0} 2025-09-07T09:58:12.4155331Z AUTOTUNE bmm(48x197x197, 48x197x64) 2025-09-07T09:58:12.4155598Z strides: [38809, 197, 1], [12608, 1, 197] 2025-09-07T09:58:12.4155880Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:12.4156532Z triton_bmm_3137 0.0131 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:12.4157538Z triton_bmm_3152 0.0132 ms 99.8% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:12.4158536Z triton_bmm_3138 0.0133 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:12.4159516Z triton_bmm_3144 0.0135 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.4160497Z triton_bmm_3146 0.0135 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.4161619Z triton_bmm_3142 0.0136 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:12.4162528Z triton_bmm_3141 0.0136 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:12.4163617Z triton_bmm_3148 0.0141 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.4164533Z triton_bmm_3150 0.0143 ms 91.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:58:12.4165606Z triton_bmm_3149 0.0144 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:12.4166408Z SingleProcess AUTOTUNE benchmarking takes 0.2018 seconds and 0.0002 seconds precompiling for 19 choices 
2025-09-07T09:58:12.5950936Z Autotune Choices Stats: 2025-09-07T09:58:12.5951933Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_3125", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.013120000250637531, "best_triton_pos": 0} 2025-09-07T09:58:12.6111869Z AUTOTUNE bmm(48x64x197, 48x197x197) 2025-09-07T09:58:12.6112153Z strides: [12608, 1, 64], [38809, 197, 1] 2025-09-07T09:58:12.6112428Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:12.6113073Z triton_bmm_3125 0.0131 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:12.6114068Z triton_bmm_3121 0.0134 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:12.6115309Z triton_bmm_3120 0.0139 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:12.6116280Z triton_bmm_3134 0.0140 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:12.6117271Z triton_bmm_3129 0.0141 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.6118238Z triton_bmm_3132 0.0144 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:12.6119204Z triton_bmm_3131 0.0145 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.6120184Z triton_bmm_3127 0.0148 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.6121136Z triton_bmm_3128 0.0149 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:12.6122049Z triton_bmm_3126 0.0158 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:12.6122834Z SingleProcess AUTOTUNE benchmarking takes 0.1951 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:58:12.8106754Z Autotune Choices Stats: 2025-09-07T09:58:12.8108023Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009440000168979168, "best_triton_pos": 1, "best_triton_time": 0.010239999741315842, "best_triton_kernel": "triton_mm_3222", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:58:12.9182411Z AUTOTUNE mm(1576x768, 768x384) 2025-09-07T09:58:12.9182841Z strides: [768, 1], [384, 1] 2025-09-07T09:58:12.9183625Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:12.9184079Z mm 0.0094 ms 100.0% 2025-09-07T09:58:12.9185575Z triton_mm_3222 0.0102 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:12.9187260Z triton_mm_3217 0.0117 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:12.9188872Z 
triton_mm_3221 0.0117 ms 80.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.9190536Z triton_mm_3228 0.0117 ms 80.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:12.9191515Z triton_mm_3218 0.0123 ms 76.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:12.9192483Z triton_mm_3220 0.0126 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:12.9193450Z triton_mm_3227 0.0126 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:12.9194449Z triton_mm_3224 0.0129 ms 73.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:12.9195552Z triton_mm_3211 0.0140 ms 67.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:12.9196401Z SingleProcess AUTOTUNE benchmarking takes 0.3063 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:58:13.1102971Z Autotune Choices Stats: 2025-09-07T09:58:13.1104216Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.008704000152647495, "best_triton_pos": 1, "best_triton_time": 0.00940799992531538, "best_triton_kernel": "triton_mm_3241", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T09:58:13.1307469Z AUTOTUNE mm(1568x384, 384x384) 2025-09-07T09:58:13.1307729Z strides: [384, 1], [384, 1] 2025-09-07T09:58:13.1307982Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:13.1308252Z mm 0.0087 ms 100.0% 2025-09-07T09:58:13.1308877Z triton_mm_3241 0.0094 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:13.1309967Z triton_mm_3240 0.0095 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:13.1311062Z triton_mm_3236 0.0096 ms 91.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:13.1312272Z triton_mm_3239 0.0098 ms 88.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:13.1313246Z triton_mm_3243 0.0102 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:13.1314214Z triton_mm_3237 0.0105 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:13.1315513Z triton_mm_3247 0.0105 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:13.1316511Z triton_mm_3246 0.0105 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:13.1317496Z 
triton_mm_3230 0.0106 ms 82.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:13.1318349Z SingleProcess AUTOTUNE benchmarking takes 0.2110 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T09:58:13.3158446Z Autotune Choices Stats: 2025-09-07T09:58:13.3159416Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_3304", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.009247999638319016, "best_triton_pos": 0} 2025-09-07T09:58:13.3480366Z AUTOTUNE mm(25088x96, 96x24) 2025-09-07T09:58:13.3480687Z strides: [96, 1], [24, 1] 2025-09-07T09:58:13.3480985Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:13.3481698Z triton_mm_3304 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:13.3482719Z triton_mm_3310 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:13.3483730Z triton_mm_3314 0.0095 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:13.3484728Z triton_mm_3305 0.0096 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:13.3485897Z triton_mm_3302 0.0096 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:13.3486909Z triton_mm_3312 0.0097 ms 95.1% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:13.3487898Z triton_mm_3315 0.0099 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:58:13.3488879Z triton_mm_3301 0.0100 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T09:58:13.3489880Z triton_mm_3306 0.0104 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:13.3490534Z mm 0.0104 ms 88.7% 2025-09-07T09:58:13.3490948Z SingleProcess AUTOTUNE benchmarking takes 0.2041 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T09:58:13.5053735Z Autotune Choices Stats: 2025-09-07T09:58:13.5055321Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_mm_3337", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.006943999789655209, "best_triton_pos": 0} 2025-09-07T09:58:13.5538693Z AUTOTUNE mm(25088x24, 24x24) 2025-09-07T09:58:13.5538953Z strides: [24, 1], [24, 1] 2025-09-07T09:58:13.5539207Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:13.5540148Z triton_mm_3337 0.0069 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:13.5541173Z triton_mm_3338 0.0072 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:13.5542279Z triton_mm_3349 
0.0072 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:13.5543270Z triton_mm_3341 0.0073 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:13.5544245Z triton_mm_3339 0.0073 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:13.5545389Z triton_mm_3344 0.0073 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:13.5546388Z triton_mm_3346 0.0073 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:13.5547405Z triton_mm_3347 0.0073 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:13.5548393Z triton_mm_3343 0.0074 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:13.5549364Z triton_mm_3345 0.0074 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:13.5550257Z SingleProcess AUTOTUNE benchmarking takes 0.1948 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T09:58:13.6243171Z Autotune Choices Stats: 2025-09-07T09:58:13.6244159Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_bmm_3363", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1", "best_time": 0.009920000098645687, "best_triton_pos": 0} 2025-09-07T09:58:13.6500946Z AUTOTUNE bmm(6272x16x16, 6272x16x6) 2025-09-07T09:58:13.6501236Z strides: [256, 1, 16], [96, 6, 1] 2025-09-07T09:58:13.6501556Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:13.6502232Z triton_bmm_3363 0.0099 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:58:13.6503268Z triton_bmm_3362 0.0100 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:58:13.6504255Z triton_bmm_3364 0.0100 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:58:13.6505387Z triton_bmm_3361 0.0100 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1 2025-09-07T09:58:13.6506722Z triton_bmm_3365 0.0100 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:58:13.6507362Z bmm 0.0117 ms 84.5% 2025-09-07T09:58:13.6507817Z SingleProcess AUTOTUNE benchmarking takes 0.0857 seconds and 0.0001 seconds precompiling for 6 choices 2025-09-07T09:58:13.7108636Z Autotune Choices Stats: 2025-09-07T09:58:13.7110665Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_bmm_3370", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1", "best_time": 0.009664000011980534, "best_triton_pos": 0} 
2025-09-07T09:58:13.8025440Z AUTOTUNE bmm(6272x16x6, 6272x6x16) 2025-09-07T09:58:13.8025890Z strides: [96, 6, 1], [96, 1, 6] 2025-09-07T09:58:13.8026312Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:13.8027359Z triton_bmm_3370 0.0097 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:58:13.8028964Z triton_bmm_3366 0.0097 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1 2025-09-07T09:58:13.8030715Z triton_bmm_3367 0.0097 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:58:13.8031844Z triton_bmm_3368 0.0097 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:58:13.8032822Z triton_bmm_3369 0.0098 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:58:13.8033433Z bmm 0.0113 ms 85.3% 2025-09-07T09:58:13.8033866Z SingleProcess AUTOTUNE benchmarking takes 0.1519 seconds and 0.0002 seconds precompiling for 6 choices 2025-09-07T09:58:13.8628885Z Autotune Choices Stats: 2025-09-07T09:58:13.8630758Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_bmm_3373", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1", "best_time": 0.00979200005531311, "best_triton_pos": 0} 2025-09-07T09:58:13.9000308Z AUTOTUNE bmm(6272x6x16, 6272x16x16) 2025-09-07T09:58:13.9000809Z strides: [96, 1, 6], [256, 16, 1] 2025-09-07T09:58:13.9001208Z dtypes: torch.float16, 
torch.float16 2025-09-07T09:58:13.9002160Z triton_bmm_3373 0.0098 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:58:13.9003547Z triton_bmm_3371 0.0098 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1 2025-09-07T09:58:13.9004925Z triton_bmm_3374 0.0098 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:58:13.9006630Z triton_bmm_3375 0.0098 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:58:13.9008078Z triton_bmm_3372 0.0099 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:58:13.9008993Z bmm 0.0118 ms 83.2% 2025-09-07T09:58:13.9009643Z SingleProcess AUTOTUNE benchmarking takes 0.0965 seconds and 0.0002 seconds precompiling for 6 choices 2025-09-07T09:58:14.0006631Z Autotune Choices Stats: 2025-09-07T09:58:14.0008180Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_bmm_3378", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1", "best_time": 0.009727999567985535, "best_triton_pos": 0} 2025-09-07T09:58:14.0706866Z AUTOTUNE bmm(6272x16x16, 6272x16x6) 2025-09-07T09:58:14.0707384Z strides: [256, 16, 1], [96, 1, 16] 2025-09-07T09:58:14.0708463Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:14.0709475Z triton_bmm_3378 0.0097 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T09:58:14.0710929Z triton_bmm_3377 0.0100 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T09:58:14.0712323Z triton_bmm_3380 0.0100 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T09:58:14.0713816Z triton_bmm_3376 0.0100 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1 2025-09-07T09:58:14.0715592Z triton_bmm_3379 0.0100 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T09:58:14.0716507Z bmm 0.1153 ms 8.4% 2025-09-07T09:58:14.0717170Z SingleProcess AUTOTUNE benchmarking takes 0.1694 seconds and 0.0004 seconds precompiling for 6 choices 2025-09-07T09:58:14.3884475Z Autotune Choices Stats: 2025-09-07T09:58:14.3886212Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_3422", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.007903999648988247, "best_triton_pos": 0} 2025-09-07T09:58:14.5394800Z AUTOTUNE mm(25088x48, 48x24) 2025-09-07T09:58:14.5395579Z strides: [48, 1], [24, 1] 2025-09-07T09:58:14.5395942Z dtypes: torch.float16, torch.float16 2025-09-07T09:58:14.5396905Z triton_mm_3422 0.0079 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T09:58:14.5398370Z triton_mm_3431 0.0080 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:14.5399810Z triton_mm_3432 0.0080 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T09:58:14.5401324Z triton_mm_3428 0.0080 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T09:58:14.5402796Z triton_mm_3430 0.0081 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T09:58:14.5404286Z triton_mm_3436 0.0082 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T09:58:14.5406079Z triton_mm_3437 0.0082 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:14.5407583Z triton_mm_3425 0.0082 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T09:58:14.5409713Z triton_mm_3435 0.0083 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T09:58:14.5411199Z triton_mm_3424 0.0083 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T09:58:14.5412850Z SingleProcess AUTOTUNE benchmarking takes 0.4265 seconds and 0.0004 seconds precompiling for 18 choices 2025-09-07T09:58:28.5665926Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put_.default 
Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T09:58:28.5667026Z pred = mod(*cloned_inputs) 2025-09-07T09:58:28.5667520Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 335, in forward 2025-09-07T09:58:28.5668043Z x = self.forward_features(x) 2025-09-07T09:58:28.5668564Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 311, in forward_features 2025-09-07T09:58:28.5669136Z pixel_embed = self.pixel_embed(x, self.pixel_pos) 2025-09-07T09:58:28.5669686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/tnt.py", line 183, in forward 2025-09-07T09:58:28.5670158Z x = self.unfold(x) 2025-09-07T09:58:28.5670297Z 2025-09-07T09:58:28.5670302Z 2025-09-07T09:58:29.9288421Z W0907 09:58:29.927000 119280 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T09:59:23.0902888Z pass 2025-09-07T09:59:32.4258065Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T09:59:32.4259454Z import pynvml # type: ignore[import] 2025-09-07T09:59:36.0170527Z 2025-09-07T09:59:39.2439844Z loading model: 0it [00:00, ?it/s] 2025-09-07T09:59:39.2440177Z loading model: 0it [00:03, ?it/s] 2025-09-07T09:59:39.2440456Z cuda train twins_pcpvt_base 2025-09-07T10:00:23.0544494Z Autotune Choices Stats: 2025-09-07T10:00:23.0545957Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_79", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.019231999292969704, "best_triton_pos": 0} 2025-09-07T10:00:23.0651596Z AUTOTUNE addmm(25088x512, 25088x64, 64x512) 2025-09-07T10:00:23.0651922Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T10:00:23.0652229Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:23.0652953Z triton_mm_79 0.0192 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.0654007Z triton_mm_80 0.0193 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:23.0655217Z triton_mm_76 0.0195 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:23.0656197Z triton_mm_86 0.0197 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:23.0657212Z triton_mm_81 0.0210 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.0658555Z triton_mm_77 0.0211 ms 91.3% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.0659445Z triton_mm_85 0.0212 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.0660563Z triton_mm_78 0.0216 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:23.0661453Z triton_mm_84 0.0218 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.0662442Z triton_mm_82 0.0218 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:23.0663229Z SingleProcess AUTOTUNE benchmarking takes 0.2759 seconds and 0.0004 seconds precompiling for 21 choices 2025-09-07T10:00:23.3965531Z Autotune Choices Stats: 2025-09-07T10:00:23.3966826Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.013279999606311321, "best_triton_pos": 1, "best_triton_time": 0.015263999812304974, "best_triton_kernel": "triton_mm_385", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:00:23.4530595Z AUTOTUNE addmm(6272x1024, 6272x128, 128x1024) 2025-09-07T10:00:23.4530907Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T10:00:23.4531226Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:23.4531550Z bias_addmm 0.0133 ms 100.0% 2025-09-07T10:00:23.4532162Z triton_mm_385 0.0153 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.4533161Z triton_mm_390 0.0153 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.4534142Z triton_mm_383 0.0157 ms 84.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.4535309Z triton_mm_387 0.0157 ms 84.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.4536302Z triton_mm_391 0.0159 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.4537354Z triton_mm_388 0.0161 ms 82.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:23.4538233Z triton_mm_386 0.0161 ms 82.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:23.4539122Z triton_mm_384 0.0164 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:23.4540004Z triton_mm_380 0.0166 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:23.4540790Z SingleProcess AUTOTUNE benchmarking takes 0.3130 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:00:23.8259636Z Autotune Choices Stats: 2025-09-07T10:00:23.8261445Z {"num_choices": 21, 
"num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.01104000024497509, "best_triton_pos": 1, "best_triton_time": 0.011168000288307667, "best_triton_kernel": "triton_mm_800", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:00:23.8555807Z AUTOTUNE addmm(1568x1280, 1568x320, 320x1280) 2025-09-07T10:00:23.8556120Z strides: [0, 1], [320, 1], [1, 320] 2025-09-07T10:00:23.8556410Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:23.8557057Z bias_addmm 0.0110 ms 100.0% 2025-09-07T10:00:23.8557825Z triton_mm_800 0.0112 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.8558869Z triton_mm_806 0.0112 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.8559853Z triton_mm_807 0.0117 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:23.8560804Z triton_mm_799 0.0118 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:23.8561792Z triton_mm_803 0.0119 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:23.8562774Z triton_mm_796 0.0120 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:23.8563735Z triton_mm_805 0.0121 ms 91.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.8564709Z triton_mm_798 0.0124 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.8565870Z triton_mm_802 0.0126 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:23.8566747Z SingleProcess AUTOTUNE benchmarking takes 0.3080 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:00:24.6855240Z Autotune Choices Stats: 2025-09-07T10:00:24.6856345Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_98", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.018783999606966972, "best_triton_pos": 0} 2025-09-07T10:00:24.9236146Z AUTOTUNE mm(25088x512, 512x64) 2025-09-07T10:00:24.9236425Z strides: [512, 1], [1, 512] 2025-09-07T10:00:24.9236692Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:24.9237342Z triton_mm_98 0.0188 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:24.9238333Z triton_mm_94 0.0190 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:24.9239271Z triton_mm_103 0.0194 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:24.9239843Z mm 0.0199 ms 94.5% 2025-09-07T10:00:24.9240374Z triton_mm_104 0.0210 ms 89.5% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:24.9241602Z triton_mm_95 0.0210 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:24.9242506Z triton_mm_99 0.0212 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:24.9243655Z triton_mm_101 0.0226 ms 83.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:24.9244557Z triton_mm_97 0.0227 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:24.9245608Z triton_mm_88 0.0228 ms 82.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:24.9246401Z SingleProcess AUTOTUNE benchmarking takes 0.4602 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:00:25.7401909Z Autotune Choices Stats: 2025-09-07T10:00:25.7402998Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_318", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00848000030964613, "best_triton_pos": 0} 2025-09-07T10:00:26.0352603Z AUTOTUNE addmm(6272x128, 6272x128, 128x128) 2025-09-07T10:00:26.0352948Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T10:00:26.0353260Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:26.0353976Z triton_mm_318 0.0085 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:26.0355325Z triton_mm_321 0.0086 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:26.0355970Z bias_addmm 0.0086 ms 98.1% 2025-09-07T10:00:26.0356567Z triton_mm_322 0.0088 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:26.0357540Z triton_mm_317 0.0088 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:26.0358503Z triton_mm_320 0.0089 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:26.0359410Z triton_mm_324 0.0089 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:26.0360306Z triton_mm_323 0.0089 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:26.0361201Z triton_mm_319 0.0090 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:26.0362110Z triton_mm_311 0.0092 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:26.0362898Z SingleProcess AUTOTUNE benchmarking takes 0.5541 seconds and 0.0002 seconds precompiling for 
21 choices 2025-09-07T10:00:26.9417916Z Autotune Choices Stats: 2025-09-07T10:00:26.9419270Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013663999736309052, "best_triton_pos": 1, "best_triton_time": 0.013728000223636627, "best_triton_kernel": "triton_mm_405", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:00:26.9728187Z AUTOTUNE mm(6272x1024, 1024x128) 2025-09-07T10:00:26.9728477Z strides: [1024, 1], [1, 1024] 2025-09-07T10:00:26.9728741Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:26.9729014Z mm 0.0137 ms 100.0% 2025-09-07T10:00:26.9730005Z triton_mm_405 0.0137 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:26.9731035Z triton_mm_411 0.0154 ms 89.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:26.9732034Z triton_mm_404 0.0159 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:26.9733003Z triton_mm_400 0.0159 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:26.9733964Z triton_mm_401 0.0161 ms 84.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:26.9735371Z triton_mm_410 0.0166 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:26.9736354Z 
triton_mm_403 0.0184 ms 74.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:26.9737347Z triton_mm_394 0.0184 ms 74.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:26.9738374Z triton_mm_407 0.0184 ms 74.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:26.9739105Z SingleProcess AUTOTUNE benchmarking takes 0.2726 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:00:27.8777024Z Autotune Choices Stats: 2025-09-07T10:00:27.8778444Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.01027199998497963, "best_triton_pos": 1, "best_triton_time": 0.010495999827980995, "best_triton_kernel": "triton_mm_2637", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:00:28.0942705Z AUTOTUNE addmm(392x2048, 392x512, 512x2048) 2025-09-07T10:00:28.0943030Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T10:00:28.0943354Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:28.0943701Z bias_addmm 0.0103 ms 100.0% 2025-09-07T10:00:28.0944347Z triton_mm_2637 0.0105 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:28.0945531Z triton_mm_2636 0.0107 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:28.0946552Z triton_mm_2632 0.0108 ms 95.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:28.0947566Z triton_mm_2633 0.0119 ms 86.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:28.0949086Z triton_mm_2635 0.0122 ms 84.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:28.0950101Z triton_mm_2639 0.0124 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:28.0951349Z triton_mm_2643 0.0126 ms 81.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:28.0952373Z triton_mm_2642 0.0127 ms 81.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:28.0953383Z triton_mm_2626 0.0136 ms 75.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:28.0954263Z SingleProcess AUTOTUNE benchmarking takes 0.4875 seconds and 0.0003 seconds precompiling for 21 choices 2025-09-07T10:00:28.6614032Z Autotune Choices Stats: 2025-09-07T10:00:28.6615464Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_733", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008383999578654766, "best_triton_pos": 0} 2025-09-07T10:00:28.7991980Z AUTOTUNE addmm(1568x320, 1568x320, 320x320) 
2025-09-07T10:00:28.7992290Z strides: [0, 1], [320, 1], [1, 320] 2025-09-07T10:00:28.7992595Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:28.7993286Z triton_mm_733 0.0084 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:28.7994312Z triton_mm_732 0.0088 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:28.7995238Z bias_addmm 0.0090 ms 93.6% 2025-09-07T10:00:28.7995828Z triton_mm_728 0.0093 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:28.7996800Z triton_mm_736 0.0096 ms 87.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:28.7997764Z triton_mm_735 0.0097 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:28.7998729Z triton_mm_727 0.0097 ms 86.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:28.7999691Z triton_mm_737 0.0097 ms 86.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:28.8000596Z triton_mm_739 0.0099 ms 84.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:28.8001501Z triton_mm_726 0.0103 ms 81.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, 
BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:28.8002285Z SingleProcess AUTOTUNE benchmarking takes 0.4122 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:00:29.6610333Z Autotune Choices Stats: 2025-09-07T10:00:29.6612211Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010623999871313572, "best_triton_pos": 1, "best_triton_time": 0.010879999957978725, "best_triton_kernel": "triton_mm_816", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:00:29.7398219Z AUTOTUNE mm(1568x1280, 1280x320) 2025-09-07T10:00:29.7398507Z strides: [1280, 1], [1, 1280] 2025-09-07T10:00:29.7398829Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:29.7399149Z mm 0.0106 ms 100.0% 2025-09-07T10:00:29.7400383Z triton_mm_816 0.0109 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:29.7401403Z triton_mm_820 0.0123 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:29.7402406Z triton_mm_812 0.0139 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:29.7403384Z triton_mm_826 0.0141 ms 75.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:29.7404376Z triton_mm_815 0.0144 ms 73.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:29.7405820Z 
triton_mm_819 0.0151 ms 70.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:29.7406792Z triton_mm_811 0.0162 ms 65.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:29.7407776Z triton_mm_809 0.0166 ms 64.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:29.7408823Z triton_mm_825 0.0169 ms 63.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:29.7409664Z SingleProcess AUTOTUNE benchmarking takes 0.3201 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:00:31.2100505Z Autotune Choices Stats: 2025-09-07T10:00:31.2102175Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_2572", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0080960001796484, "best_triton_pos": 0} 2025-09-07T10:00:31.4187736Z AUTOTUNE addmm(392x512, 392x512, 512x512) 2025-09-07T10:00:31.4188303Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T10:00:31.4188774Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:31.4189854Z triton_mm_2572 0.0081 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:31.4191327Z triton_mm_2576 0.0084 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:31.4192196Z 
bias_addmm 0.0089 ms 91.3% 2025-09-07T10:00:31.4193001Z triton_mm_2571 0.0092 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:31.4194301Z triton_mm_2580 0.0093 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:31.4196430Z triton_mm_2570 0.0095 ms 85.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:31.4197697Z triton_mm_2575 0.0097 ms 83.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:31.4199305Z triton_mm_2569 0.0098 ms 82.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:31.4200604Z triton_mm_2579 0.0100 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:31.4201915Z triton_mm_2578 0.0105 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:31.4203036Z SingleProcess AUTOTUNE benchmarking takes 0.7366 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:00:32.4570816Z Autotune Choices Stats: 2025-09-07T10:00:32.4572619Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_2648", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.01119999960064888, 
"best_triton_pos": 0} 2025-09-07T10:00:32.6036679Z AUTOTUNE mm(392x2048, 2048x512) 2025-09-07T10:00:32.6037024Z strides: [2048, 1], [1, 2048] 2025-09-07T10:00:32.6037291Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:32.6037966Z triton_mm_2648 0.0112 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:32.6038641Z mm 0.0116 ms 97.0% 2025-09-07T10:00:32.6039222Z triton_mm_2652 0.0121 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:32.6040230Z triton_mm_2656 0.0134 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:32.6041183Z triton_mm_2662 0.0181 ms 61.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:32.6042104Z triton_mm_2647 0.0187 ms 59.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:32.6043009Z triton_mm_2646 0.0187 ms 59.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:32.6043907Z triton_mm_2651 0.0193 ms 57.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:32.6044811Z triton_mm_2655 0.0197 ms 56.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:32.6045888Z triton_mm_2645 0.0199 ms 56.3% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:32.6046700Z SingleProcess AUTOTUNE benchmarking takes 0.4129 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:00:33.2597571Z Autotune Choices Stats: 2025-09-07T10:00:33.2599018Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_369", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:00:33.3074816Z AUTOTUNE mm(6272x128, 128x128) 2025-09-07T10:00:33.3075377Z strides: [128, 1], [1, 128] 2025-09-07T10:00:33.3075661Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:33.3076332Z triton_mm_369 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:33.3077418Z mm 0.0083 ms 99.2% 2025-09-07T10:00:33.3078029Z triton_mm_363 0.0083 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:33.3079003Z triton_mm_368 0.0083 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:33.3079987Z triton_mm_362 0.0083 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:33.3080928Z triton_mm_364 0.0084 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:33.3081915Z triton_mm_357 0.0088 
ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:33.3082879Z triton_mm_372 0.0088 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:33.3083847Z triton_mm_373 0.0088 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:33.3084814Z triton_mm_361 0.0088 ms 92.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:33.3085829Z SingleProcess AUTOTUNE benchmarking takes 0.3531 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:00:33.8165739Z Autotune Choices Stats: 2025-09-07T10:00:33.8166854Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_778", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00825599953532219, "best_triton_pos": 0} 2025-09-07T10:00:34.0136384Z AUTOTUNE mm(1568x320, 320x320) 2025-09-07T10:00:34.0136829Z strides: [320, 1], [1, 320] 2025-09-07T10:00:34.0137232Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:34.0138272Z triton_mm_778 0.0083 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:34.0139265Z mm 0.0084 ms 98.1% 2025-09-07T10:00:34.0140172Z triton_mm_777 0.0085 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:34.0141076Z 
triton_mm_781 0.0088 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:34.0141981Z triton_mm_780 0.0090 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:34.0142816Z triton_mm_782 0.0091 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:34.0144028Z triton_mm_784 0.0091 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:34.0144865Z triton_mm_773 0.0092 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:34.0146242Z triton_mm_787 0.0095 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:34.0147077Z triton_mm_772 0.0096 ms 86.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:34.0147802Z SingleProcess AUTOTUNE benchmarking takes 0.4198 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:00:35.1163936Z Autotune Choices Stats: 2025-09-07T10:00:35.1165445Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_2610", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007968000136315823, "best_triton_pos": 0} 2025-09-07T10:00:35.3519856Z AUTOTUNE 
mm(392x512, 512x512) 2025-09-07T10:00:35.3520156Z strides: [512, 1], [1, 512] 2025-09-07T10:00:35.3520418Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:35.3521097Z triton_mm_2610 0.0080 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:35.3521692Z mm 0.0085 ms 93.3% 2025-09-07T10:00:35.3522213Z triton_mm_2609 0.0091 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:35.3523133Z triton_mm_2624 0.0101 ms 79.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:35.3524031Z triton_mm_2623 0.0106 ms 75.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:35.3524929Z triton_mm_2622 0.0129 ms 61.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:35.3526187Z triton_mm_2612 0.0160 ms 49.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:35.3527086Z triton_mm_2620 0.0176 ms 45.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:35.3527976Z triton_mm_2608 0.0178 ms 44.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:35.3528852Z triton_mm_2606 0.0190 ms 41.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, 
BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:00:35.3529640Z SingleProcess AUTOTUNE benchmarking takes 0.6042 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T10:00:39.7774714Z Autotune Choices Stats: 2025-09-07T10:00:39.7775988Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_0", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.015072000212967396, "best_triton_pos": 0} 2025-09-07T10:00:39.8223999Z AUTOTUNE convolution(8x3x224x224, 64x3x4x4) 2025-09-07T10:00:39.8224338Z strides: [150528, 1, 672, 3], [48, 1, 12, 3] 2025-09-07T10:00:39.8224632Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:39.8225600Z triton_convolution2d_0 0.0151 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:39.8227196Z triton_convolution2d_4 0.0152 ms 98.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:39.8227960Z convolution 0.0161 ms 93.5% 2025-09-07T10:00:39.8228686Z triton_convolution2d_3 0.0161 ms 93.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:39.8229899Z triton_convolution2d_5 0.0195 ms 77.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:39.8231134Z triton_convolution2d_2 0.0226 ms 66.8% ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:00:39.8232525Z triton_convolution2d_1 0.0473 ms 31.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:39.8233490Z SingleProcess AUTOTUNE benchmarking takes 0.1444 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:00:40.0807715Z Autotune Choices Stats: 2025-09-07T10:00:40.0808710Z {"num_choices": 20, "num_triton_choices": 18, "best_kernel": "triton_mm_18", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.009151999838650227, "best_triton_pos": 0} 2025-09-07T10:00:40.1025277Z AUTOTUNE addmm(25088x64, 25088x64, 64x64) 2025-09-07T10:00:40.1025613Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T10:00:40.1025926Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:40.1026637Z triton_mm_18 0.0092 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:40.1027607Z triton_mm_16 0.0092 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:40.1028553Z triton_mm_17 0.0093 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.1029515Z triton_mm_10 0.0094 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:40.1030466Z 
triton_mm_23 0.0094 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:40.1031475Z triton_mm_22 0.0095 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.1032576Z triton_mm_14 0.0095 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:40.1033519Z triton_mm_19 0.0096 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.1034704Z triton_mm_15 0.0096 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.1035788Z triton_mm_12 0.0097 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:40.1036798Z SingleProcess AUTOTUNE benchmarking takes 0.2794 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:00:40.2852268Z Autotune Choices Stats: 2025-09-07T10:00:40.2853622Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.022752000018954277, "best_triton_pos": 1, "best_triton_time": 0.09305600076913834, "best_triton_kernel": "triton_convolution2d_29", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=8, KERNEL_W=8, PADDING_H=0, PADDING_W=0, STRIDE_H=8, STRIDE_W=8, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:00:40.3173449Z AUTOTUNE convolution(8x64x56x56, 64x64x8x8) 2025-09-07T10:00:40.3173800Z strides: [200704, 1, 3584, 64], 
[4096, 1, 512, 64] 2025-09-07T10:00:40.3174131Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:40.3174400Z convolution 0.0228 ms 100.0% 2025-09-07T10:00:40.3175282Z triton_convolution2d_29 0.0931 ms 24.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=8, KERNEL_W=8, PADDING_H=0, PADDING_W=0, STRIDE_H=8, STRIDE_W=8, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:40.3176494Z triton_convolution2d_28 0.0946 ms 24.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=8, KERNEL_W=8, PADDING_H=0, PADDING_W=0, STRIDE_H=8, STRIDE_W=8, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:40.3177696Z triton_convolution2d_27 0.1055 ms 21.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=8, KERNEL_W=8, PADDING_H=0, PADDING_W=0, STRIDE_H=8, STRIDE_W=8, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:40.3178921Z triton_convolution2d_30 0.1206 ms 18.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=8, KERNEL_W=8, PADDING_H=0, PADDING_W=0, STRIDE_H=8, STRIDE_W=8, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:40.3180117Z triton_convolution2d_24 0.1507 ms 15.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=8, KERNEL_W=8, PADDING_H=0, PADDING_W=0, STRIDE_H=8, STRIDE_W=8, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:40.3181310Z triton_convolution2d_25 0.1957 ms 11.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=8, KERNEL_W=8, PADDING_H=0, PADDING_W=0, STRIDE_H=8, STRIDE_W=8, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:40.3182655Z triton_convolution2d_26 0.2666 ms 8.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=8, KERNEL_W=8, PADDING_H=0, PADDING_W=0, STRIDE_H=8, STRIDE_W=8, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:00:40.3183489Z SingleProcess AUTOTUNE benchmarking takes 0.2142 seconds and 0.0002 seconds precompiling for 8 
choices 2025-09-07T10:00:40.5842313Z Autotune Choices Stats: 2025-09-07T10:00:40.5843277Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_32", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.006432000081986189, "best_triton_pos": 0} 2025-09-07T10:00:40.6275670Z AUTOTUNE addmm(392x128, 392x64, 64x128) 2025-09-07T10:00:40.6275937Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T10:00:40.6276227Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:40.6276896Z triton_mm_32 0.0064 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:40.6278082Z triton_mm_35 0.0068 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:40.6279028Z triton_mm_34 0.0068 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:40.6280137Z triton_mm_39 0.0068 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:40.6281082Z triton_mm_33 0.0069 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:40.6282013Z triton_mm_38 0.0069 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:40.6282912Z triton_mm_42 0.0069 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.6283783Z triton_mm_43 0.0070 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:40.6284656Z triton_mm_44 0.0073 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.6285516Z bias_addmm 0.0074 ms 87.4% 2025-09-07T10:00:40.6285958Z SingleProcess AUTOTUNE benchmarking takes 0.3084 seconds and 0.0003 seconds precompiling for 21 choices 2025-09-07T10:00:40.8481590Z Autotune Choices Stats: 2025-09-07T10:00:40.8482777Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_57", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.008511999621987343, "best_triton_pos": 0} 2025-09-07T10:00:40.8827916Z AUTOTUNE mm(25088x64, 64x64) 2025-09-07T10:00:40.8828192Z strides: [64, 1], [1, 64] 2025-09-07T10:00:40.8828450Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:40.8829123Z triton_mm_57 0.0085 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:40.8830106Z triton_mm_62 0.0086 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:40.8831049Z triton_mm_61 0.0087 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.8832117Z triton_mm_58 0.0087 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:40.8833088Z triton_mm_67 0.0088 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:40.8834030Z triton_mm_54 0.0089 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:40.8835624Z triton_mm_64 0.0090 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:40.8836601Z triton_mm_63 0.0091 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.8837805Z triton_mm_66 0.0091 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:40.8838763Z triton_mm_51 0.0092 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:40.8839767Z SingleProcess AUTOTUNE benchmarking takes 0.2546 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:00:41.0142594Z Autotune Choices Stats: 2025-09-07T10:00:41.0143763Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_307", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.01206399966031313, "best_triton_pos": 0} 2025-09-07T10:00:41.2945855Z AUTOTUNE convolution(8x64x56x56, 128x64x2x2) 2025-09-07T10:00:41.2946204Z 
strides: [200704, 1, 3584, 64], [256, 1, 128, 64] 2025-09-07T10:00:41.2946546Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:41.2947335Z triton_convolution2d_307 0.0121 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.2948130Z convolution 0.0127 ms 94.7% 2025-09-07T10:00:41.2948893Z triton_convolution2d_308 0.0131 ms 92.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.2950118Z triton_convolution2d_306 0.0137 ms 88.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.2951339Z triton_convolution2d_309 0.0141 ms 85.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.2952741Z triton_convolution2d_303 0.0157 ms 76.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.2953966Z triton_convolution2d_304 0.0184 ms 65.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.2955384Z triton_convolution2d_305 0.0298 ms 40.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:00:41.2956341Z SingleProcess AUTOTUNE benchmarking takes 0.3808 
seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:00:41.4290327Z Autotune Choices Stats: 2025-09-07T10:00:41.4291682Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.015359999611973763, "best_triton_pos": 1, "best_triton_time": 0.04790399968624115, "best_triton_kernel": "triton_convolution2d_333", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:00:41.4622072Z AUTOTUNE convolution(8x128x28x28, 128x128x4x4) 2025-09-07T10:00:41.4622445Z strides: [100352, 1, 3584, 128], [2048, 1, 512, 128] 2025-09-07T10:00:41.4622775Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:41.4623064Z convolution 0.0154 ms 100.0% 2025-09-07T10:00:41.4623802Z triton_convolution2d_333 0.0479 ms 32.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.4625692Z triton_convolution2d_334 0.0560 ms 27.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.4627200Z triton_convolution2d_332 0.0610 ms 25.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.4628431Z triton_convolution2d_335 0.0627 ms 24.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.4629648Z triton_convolution2d_329 0.0783 ms 19.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, GROUPS=1, KERNEL_H=4, 
KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.4630868Z triton_convolution2d_330 0.0946 ms 16.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.4632212Z triton_convolution2d_331 0.1372 ms 11.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:00:41.4633264Z SingleProcess AUTOTUNE benchmarking takes 0.1671 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:00:41.7280303Z Autotune Choices Stats: 2025-09-07T10:00:41.7281258Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_339", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.007040000054985285, "best_triton_pos": 0} 2025-09-07T10:00:41.7394699Z AUTOTUNE addmm(392x256, 392x128, 128x256) 2025-09-07T10:00:41.7395123Z strides: [0, 1], [128, 1], [1, 128] 2025-09-07T10:00:41.7395420Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:41.7396101Z triton_mm_339 0.0070 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:41.7397109Z triton_mm_337 0.0071 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.7398075Z triton_mm_338 0.0072 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:41.7399040Z 
triton_mm_343 0.0072 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:41.7399646Z bias_addmm 0.0074 ms 95.7% 2025-09-07T10:00:41.7400227Z triton_mm_347 0.0075 ms 93.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:41.7401181Z triton_mm_344 0.0076 ms 92.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:41.7402142Z triton_mm_350 0.0078 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:41.7403100Z triton_mm_348 0.0078 ms 89.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:41.7404226Z triton_mm_345 0.0079 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:41.7405136Z SingleProcess AUTOTUNE benchmarking takes 0.2757 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:00:41.8737476Z Autotune Choices Stats: 2025-09-07T10:00:41.8739170Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.013567999936640263, "best_triton_pos": 1, "best_triton_time": 0.016127999871969223, "best_triton_kernel": "triton_convolution2d_722", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:00:41.9094207Z AUTOTUNE convolution(8x128x28x28, 
320x128x2x2) 2025-09-07T10:00:41.9094590Z strides: [100352, 1, 3584, 128], [512, 1, 256, 128] 2025-09-07T10:00:41.9094906Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:41.9095512Z convolution 0.0136 ms 100.0% 2025-09-07T10:00:41.9096271Z triton_convolution2d_722 0.0161 ms 84.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.9097518Z triton_convolution2d_721 0.0197 ms 68.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.9098735Z triton_convolution2d_723 0.0201 ms 67.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.9099946Z triton_convolution2d_724 0.0212 ms 64.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:41.9101158Z triton_convolution2d_718 0.0260 ms 52.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.9102549Z triton_convolution2d_719 0.0269 ms 50.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:41.9103647Z triton_convolution2d_720 0.0521 ms 26.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:00:41.9104483Z 
SingleProcess AUTOTUNE benchmarking takes 0.1387 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:00:42.0262163Z Autotune Choices Stats: 2025-09-07T10:00:42.0263515Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.012512000277638435, "best_triton_pos": 1, "best_triton_time": 0.03372799977660179, "best_triton_kernel": "triton_convolution2d_748", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:00:42.0593891Z AUTOTUNE convolution(8x320x14x14, 320x320x2x2) 2025-09-07T10:00:42.0594274Z strides: [62720, 1, 4480, 320], [1280, 1, 640, 320] 2025-09-07T10:00:42.0594576Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:42.0594842Z convolution 0.0125 ms 100.0% 2025-09-07T10:00:42.0595713Z triton_convolution2d_748 0.0337 ms 37.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:42.0597237Z triton_convolution2d_747 0.0413 ms 30.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:42.0598469Z triton_convolution2d_750 0.0414 ms 30.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:42.0599839Z triton_convolution2d_749 0.0417 ms 30.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:42.0601054Z triton_convolution2d_745 0.0583 ms 21.4% ALLOW_TF32=False, BLOCK_K=16, 
BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:42.0602275Z triton_convolution2d_744 0.0584 ms 21.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:42.0603478Z triton_convolution2d_746 0.0892 ms 14.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:00:42.0604365Z SingleProcess AUTOTUNE benchmarking takes 0.1482 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:00:42.3237665Z Autotune Choices Stats: 2025-09-07T10:00:42.3238643Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_759", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00825599953532219, "best_triton_pos": 0} 2025-09-07T10:00:42.3756394Z AUTOTUNE addmm(392x640, 392x320, 320x640) 2025-09-07T10:00:42.3756679Z strides: [0, 1], [320, 1], [1, 320] 2025-09-07T10:00:42.3756969Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:42.3757681Z triton_mm_759 0.0083 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:42.3758672Z triton_mm_753 0.0084 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:42.3759283Z bias_addmm 0.0086 ms 95.9% 2025-09-07T10:00:42.3759875Z triton_mm_758 0.0086 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:42.3760831Z triton_mm_752 0.0091 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:42.3761798Z triton_mm_762 0.0091 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:42.3762765Z triton_mm_763 0.0092 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:42.3763701Z triton_mm_754 0.0092 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:42.3764586Z triton_mm_761 0.0094 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:42.3765737Z triton_mm_765 0.0095 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:42.3766797Z SingleProcess AUTOTUNE benchmarking takes 0.3146 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:00:42.7014022Z Autotune Choices Stats: 2025-09-07T10:00:42.7016022Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01244799979031086, "best_triton_pos": 1, "best_triton_time": 0.03311999887228012, "best_triton_kernel": "triton_convolution2d_2565", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 
2025-09-07T10:00:42.7509147Z AUTOTUNE convolution(8x320x14x14, 512x320x2x2) 2025-09-07T10:00:42.7509525Z strides: [62720, 1, 4480, 320], [1280, 1, 640, 320] 2025-09-07T10:00:42.7509835Z dtypes: torch.float16, torch.float16 2025-09-07T10:00:42.7510137Z convolution 0.0124 ms 100.0% 2025-09-07T10:00:42.7510882Z triton_convolution2d_2565 0.0331 ms 37.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:42.7512162Z triton_convolution2d_2564 0.0410 ms 30.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:42.7513560Z triton_convolution2d_2567 0.0416 ms 29.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:42.7514780Z triton_convolution2d_2566 0.0431 ms 28.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:00:42.7516162Z triton_convolution2d_2561 0.0583 ms 21.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:42.7517403Z triton_convolution2d_2562 0.0586 ms 21.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:00:42.7518634Z triton_convolution2d_2563 0.0849 ms 14.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, 
UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:00:42.7519611Z SingleProcess AUTOTUNE benchmarking takes 0.1668 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:00:43.0192870Z Autotune Choices Stats: 2025-09-07T10:00:43.0194190Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_2595", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008608000352978706, "best_triton_pos": 0} 2025-09-07T10:00:43.0372568Z AUTOTUNE addmm(392x1024, 392x512, 512x1024) 2025-09-07T10:00:43.0372877Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T10:00:43.0373206Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:43.0373934Z triton_mm_2595 0.0086 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:43.0374581Z bias_addmm 0.0091 ms 94.7% 2025-09-07T10:00:43.0375376Z triton_mm_2599 0.0095 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:43.0376708Z triton_mm_2594 0.0097 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:00:43.0377691Z triton_mm_2598 0.0103 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:43.0378703Z triton_mm_2590 0.0107 ms 80.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:43.0379857Z triton_mm_2591 0.0107 ms 80.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:43.0380819Z triton_mm_2588 0.0108 ms 79.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:00:43.0381878Z triton_mm_2601 0.0109 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:00:43.0382953Z triton_mm_2605 0.0112 ms 77.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:43.0383728Z SingleProcess AUTOTUNE benchmarking takes 0.2842 seconds and 0.0004 seconds precompiling for 21 choices 2025-09-07T10:00:43.2953897Z Autotune Choices Stats: 2025-09-07T10:00:43.2955210Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_2857", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:00:43.3080720Z AUTOTUNE addmm(8x1000, 8x512, 512x1000) 2025-09-07T10:00:43.3080999Z strides: [0, 1], [512, 1], [1, 512] 2025-09-07T10:00:43.3081302Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:00:43.3081979Z triton_mm_2857 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:00:43.3083134Z triton_mm_2861 0.0084 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:43.3083869Z bias_addmm 0.0085 ms 96.6% 2025-09-07T10:00:43.3084485Z triton_mm_2869 
0.0090 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:00:43.3085660Z triton_mm_2856 0.0092 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:00:43.3086651Z triton_mm_2855 0.0094 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:00:43.3087627Z triton_mm_2860 0.0094 ms 86.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:43.3088620Z triton_mm_2865 0.0095 ms 86.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:00:43.3089608Z triton_mm_2854 0.0096 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:00:43.3090582Z triton_mm_2864 0.0099 ms 82.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:00:43.3091762Z SingleProcess AUTOTUNE benchmarking takes 0.2536 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:01:26.6892207Z Autotune Choices Stats: 2025-09-07T10:01:26.6893288Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_7662", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.016736000776290894, "best_triton_pos": 0} 2025-09-07T10:01:26.7189454Z AUTOTUNE mm(25088x64, 
64x512) 2025-09-07T10:01:26.7189821Z strides: [64, 1], [512, 1] 2025-09-07T10:01:26.7190094Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:26.7190786Z triton_mm_7662 0.0167 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:26.7191809Z triton_mm_7669 0.0168 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:26.7192795Z triton_mm_7665 0.0176 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:26.7193761Z triton_mm_7664 0.0178 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:26.7194789Z triton_mm_7666 0.0178 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:26.7195797Z mm 0.0179 ms 93.7% 2025-09-07T10:01:26.7196363Z triton_mm_7671 0.0184 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:26.7197350Z triton_mm_7661 0.0187 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:26.7198327Z triton_mm_7663 0.0188 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:26.7199304Z triton_mm_7670 0.0189 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:26.7200155Z SingleProcess AUTOTUNE benchmarking takes 0.2465 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T10:01:27.0154886Z Autotune Choices Stats: 2025-09-07T10:01:27.0156609Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01244799979031086, "best_triton_pos": 1, "best_triton_time": 0.013887999579310417, "best_triton_kernel": "triton_mm_6906", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:01:27.0587952Z AUTOTUNE mm(6272x128, 128x1024) 2025-09-07T10:01:27.0588186Z strides: [128, 1], [1024, 1] 2025-09-07T10:01:27.0588447Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:27.0588695Z mm 0.0124 ms 100.0% 2025-09-07T10:01:27.0589255Z triton_mm_6906 0.0139 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.0590173Z triton_mm_6909 0.0140 ms 89.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.0591063Z triton_mm_6904 0.0141 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.0592359Z triton_mm_6910 0.0141 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.0593260Z triton_mm_6902 0.0141 ms 88.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.0594340Z 
triton_mm_6903 0.0150 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:27.0595506Z triton_mm_6911 0.0155 ms 80.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:27.0596480Z triton_mm_6899 0.0156 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:27.0597442Z triton_mm_6907 0.0156 ms 79.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:27.0598280Z SingleProcess AUTOTUNE benchmarking takes 0.2491 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:27.4944212Z Autotune Choices Stats: 2025-09-07T10:01:27.4945746Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_3490", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.010367999784648418, "best_triton_pos": 0} 2025-09-07T10:01:27.5274349Z AUTOTUNE mm(1568x320, 320x1280) 2025-09-07T10:01:27.5274647Z strides: [320, 1], [1280, 1] 2025-09-07T10:01:27.5275107Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:27.5275899Z triton_mm_3490 0.0104 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.5276979Z triton_mm_3484 0.0105 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.5277638Z mm 0.0106 ms 97.6% 
2025-09-07T10:01:27.5278271Z triton_mm_3491 0.0107 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:27.5279330Z triton_mm_3483 0.0108 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:27.5280360Z triton_mm_3487 0.0109 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:27.5281398Z triton_mm_3482 0.0112 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.5282426Z triton_mm_3480 0.0113 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:27.5283462Z triton_mm_3489 0.0115 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.5284500Z triton_mm_3486 0.0119 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:27.5285956Z SingleProcess AUTOTUNE benchmarking takes 0.3562 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:28.4458117Z Autotune Choices Stats: 2025-09-07T10:01:28.4459457Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009440000168979168, "best_triton_pos": 1, "best_triton_time": 0.009664000011980534, "best_triton_kernel": "triton_mm_2934", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:01:28.6132605Z AUTOTUNE mm(512x392, 392x2048) 2025-09-07T10:01:28.6132899Z strides: [1, 512], [2048, 1] 2025-09-07T10:01:28.6133175Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:28.6133459Z mm 0.0094 ms 100.0% 2025-09-07T10:01:28.6134108Z triton_mm_2934 0.0097 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:28.6135501Z triton_mm_2932 0.0100 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:28.6136423Z triton_mm_2936 0.0100 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:28.6137322Z triton_mm_2929 0.0101 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:28.6138208Z triton_mm_2933 0.0102 ms 92.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:28.6139104Z triton_mm_2940 0.0106 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:28.6140014Z triton_mm_2930 0.0111 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:28.6140919Z triton_mm_2931 0.0111 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:01:28.6141905Z triton_mm_2939 0.0113 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:28.6142691Z SingleProcess AUTOTUNE benchmarking takes 0.3680 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:28.8379578Z Autotune Choices Stats: 2025-09-07T10:01:28.8380843Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009312000125646591, "best_triton_pos": 1, "best_triton_time": 0.009855999611318111, "best_triton_kernel": "triton_mm_2970", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8"} 2025-09-07T10:01:28.8552673Z AUTOTUNE mm(2048x392, 392x512) 2025-09-07T10:01:28.8553023Z strides: [1, 2048], [512, 1] 2025-09-07T10:01:28.8553292Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:28.8553556Z mm 0.0093 ms 100.0% 2025-09-07T10:01:28.8554201Z triton_mm_2970 0.0099 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:28.8555609Z triton_mm_2971 0.0100 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:28.8556620Z triton_mm_2972 0.0101 ms 92.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:28.8558029Z triton_mm_2974 0.0102 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:28.8559004Z triton_mm_2967 0.0103 ms 90.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:28.8560231Z triton_mm_2978 0.0106 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:28.8561234Z triton_mm_2977 0.0112 ms 83.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:28.8562219Z triton_mm_2969 0.0114 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:28.8563194Z triton_mm_2968 0.0115 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:28.8564040Z SingleProcess AUTOTUNE benchmarking takes 0.2171 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:29.1284088Z Autotune Choices Stats: 2025-09-07T10:01:29.1285719Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009503999724984169, "best_triton_pos": 1, "best_triton_time": 0.010048000141978264, "best_triton_kernel": "triton_mm_2915", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:01:29.1763645Z AUTOTUNE mm(392x512, 512x2048) 2025-09-07T10:01:29.1763963Z strides: [512, 1], [2048, 1] 2025-09-07T10:01:29.1764230Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:29.1764500Z mm 0.0095 ms 100.0% 2025-09-07T10:01:29.1765474Z triton_mm_2915 0.0100 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=4 2025-09-07T10:01:29.1766464Z triton_mm_2914 0.0102 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:29.1767454Z triton_mm_2910 0.0103 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:29.1768462Z triton_mm_2921 0.0108 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:29.1769453Z triton_mm_2913 0.0108 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:29.1770409Z triton_mm_2917 0.0109 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:29.1771383Z triton_mm_2920 0.0116 ms 82.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:29.1772347Z triton_mm_2911 0.0116 ms 81.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:29.1773335Z triton_mm_2912 0.0120 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:29.1774580Z SingleProcess AUTOTUNE benchmarking takes 0.2478 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:29.7271975Z Autotune Choices Stats: 2025-09-07T10:01:29.7273025Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_3044", 
"best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008448000065982342, "best_triton_pos": 0} 2025-09-07T10:01:29.9015624Z AUTOTUNE mm(1024x392, 392x512) 2025-09-07T10:01:29.9016014Z strides: [1, 1024], [512, 1] 2025-09-07T10:01:29.9016246Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:29.9016843Z triton_mm_3044 0.0084 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:29.9017409Z mm 0.0085 ms 99.2% 2025-09-07T10:01:29.9017937Z triton_mm_3048 0.0092 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:29.9018811Z triton_mm_3043 0.0092 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:29.9019683Z triton_mm_3046 0.0095 ms 88.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:29.9020535Z triton_mm_3047 0.0096 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:29.9021388Z triton_mm_3050 0.0097 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:29.9022355Z triton_mm_3039 0.0099 ms 85.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:29.9023222Z triton_mm_3054 0.0102 ms 82.5% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:29.9024088Z triton_mm_3038 0.0105 ms 80.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:29.9024850Z SingleProcess AUTOTUNE benchmarking takes 0.3733 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:30.1423853Z Autotune Choices Stats: 2025-09-07T10:01:30.1424906Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_2894", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.006304000038653612, "best_triton_pos": 0} 2025-09-07T10:01:30.3799475Z AUTOTUNE mm(1000x8, 8x512) 2025-09-07T10:01:30.3799794Z strides: [1, 1000], [512, 1] 2025-09-07T10:01:30.3800066Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:30.3800758Z triton_mm_2894 0.0063 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:30.3801822Z triton_mm_2892 0.0064 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:30.3802816Z triton_mm_2887 0.0064 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:01:30.3804231Z triton_mm_2889 0.0065 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:30.3805599Z triton_mm_2891 0.0065 ms 97.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:30.3806640Z triton_mm_2893 0.0065 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:30.3807767Z triton_mm_2888 0.0065 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:30.3808678Z triton_mm_2890 0.0066 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:30.3809599Z triton_mm_2895 0.0066 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:30.3810510Z triton_mm_2896 0.0067 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:30.3811296Z SingleProcess AUTOTUNE benchmarking takes 0.4001 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:01:30.7978198Z Autotune Choices Stats: 2025-09-07T10:01:30.7979584Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010847999714314938, "best_triton_pos": 1, "best_triton_time": 0.011872000060975552, "best_triton_kernel": "triton_mm_3500", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:30.8299912Z AUTOTUNE mm(320x1568, 1568x1280) 2025-09-07T10:01:30.8300327Z strides: [1, 320], [1280, 1] 2025-09-07T10:01:30.8300601Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:30.8300873Z mm 
0.0108 ms 100.0% 2025-09-07T10:01:30.8301619Z triton_mm_3500 0.0119 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:30.8302683Z triton_mm_3504 0.0136 ms 80.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:30.8303702Z triton_mm_3496 0.0165 ms 65.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:30.8304687Z triton_mm_3499 0.0171 ms 63.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:30.8306112Z triton_mm_3510 0.0174 ms 62.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:30.8307138Z triton_mm_3503 0.0175 ms 62.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:30.8308118Z triton_mm_3502 0.0177 ms 61.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:30.8309083Z triton_mm_3506 0.0178 ms 61.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:30.8310042Z triton_mm_3495 0.0179 ms 60.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:30.8311262Z SingleProcess AUTOTUNE benchmarking takes 0.2537 seconds 
and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:31.2479669Z Autotune Choices Stats: 2025-09-07T10:01:31.2481410Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010847999714314938, "best_triton_pos": 1, "best_triton_time": 0.012095999903976917, "best_triton_kernel": "triton_mm_3538", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:31.2889362Z AUTOTUNE mm(1280x1568, 1568x320) 2025-09-07T10:01:31.2889657Z strides: [1, 1280], [320, 1] 2025-09-07T10:01:31.2889921Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:31.2890191Z mm 0.0108 ms 100.0% 2025-09-07T10:01:31.2890827Z triton_mm_3538 0.0121 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:31.2891883Z triton_mm_3542 0.0138 ms 78.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:31.2892896Z triton_mm_3534 0.0167 ms 64.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:31.2893891Z triton_mm_3537 0.0172 ms 63.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:31.2894890Z triton_mm_3548 0.0174 ms 62.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:31.2896322Z triton_mm_3544 0.0176 ms 61.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=4, num_warps=8 2025-09-07T10:01:31.2897270Z triton_mm_3540 0.0176 ms 61.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:31.2898186Z triton_mm_3541 0.0177 ms 61.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:31.2899091Z triton_mm_3533 0.0180 ms 60.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:31.2899871Z SingleProcess AUTOTUNE benchmarking takes 0.2627 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:31.9577581Z Autotune Choices Stats: 2025-09-07T10:01:31.9579302Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_3006", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008063999935984612, "best_triton_pos": 0} 2025-09-07T10:01:31.9859966Z AUTOTUNE mm(512x392, 392x512) 2025-09-07T10:01:31.9860246Z strides: [1, 512], [512, 1] 2025-09-07T10:01:31.9860502Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:31.9861184Z triton_mm_3006 0.0081 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:31.9861897Z mm 0.0081 ms 99.6% 2025-09-07T10:01:31.9862477Z triton_mm_3002 0.0082 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:31.9863436Z triton_mm_3001 0.0086 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:31.9864790Z triton_mm_3000 0.0088 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:31.9868937Z triton_mm_3010 0.0090 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:31.9870151Z triton_mm_3005 0.0091 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:31.9871125Z triton_mm_3012 0.0093 ms 86.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:31.9872098Z triton_mm_3008 0.0094 ms 86.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:31.9873055Z triton_mm_3009 0.0095 ms 85.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:31.9873914Z SingleProcess AUTOTUNE benchmarking takes 0.2231 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:32.2705648Z Autotune Choices Stats: 2025-09-07T10:01:32.2706972Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.007968000136315823, "best_triton_pos": 1, "best_triton_time": 0.008224000222980976, "best_triton_kernel": "triton_mm_3610", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:32.3379504Z AUTOTUNE mm(640x392, 392x320) 
2025-09-07T10:01:32.3379862Z strides: [1, 640], [320, 1] 2025-09-07T10:01:32.3380129Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:32.3380416Z mm 0.0080 ms 100.0% 2025-09-07T10:01:32.3381046Z triton_mm_3610 0.0082 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:32.3382170Z triton_mm_3614 0.0082 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:32.3383148Z triton_mm_3609 0.0087 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:32.3384113Z triton_mm_3608 0.0089 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:32.3385256Z triton_mm_3613 0.0090 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:32.3386286Z triton_mm_3618 0.0091 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:32.3387373Z triton_mm_3616 0.0093 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:32.3388363Z triton_mm_3620 0.0095 ms 83.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:32.3389329Z triton_mm_3617 0.0096 ms 83.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:32.3390616Z SingleProcess AUTOTUNE benchmarking takes 0.2617 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:33.2738816Z Autotune Choices Stats: 2025-09-07T10:01:33.2741654Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01692800037562847, "best_triton_pos": 1, "best_triton_time": 0.02038400061428547, "best_triton_kernel": "triton_mm_6916", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:33.2854920Z AUTOTUNE mm(128x6272, 6272x1024) 2025-09-07T10:01:33.2855671Z strides: [1, 128], [1024, 1] 2025-09-07T10:01:33.2856120Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:33.2856563Z mm 0.0169 ms 100.0% 2025-09-07T10:01:33.2857538Z triton_mm_6916 0.0204 ms 83.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:33.2859147Z triton_mm_6920 0.0216 ms 78.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:33.2860705Z triton_mm_6924 0.0255 ms 66.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:33.2862398Z triton_mm_6930 0.0408 ms 41.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:33.2863939Z triton_mm_6915 0.0419 ms 40.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:33.2865733Z 
triton_mm_6914 0.0434 ms 39.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:33.2867471Z triton_mm_6923 0.0447 ms 37.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:33.2868437Z triton_mm_6919 0.0452 ms 37.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:33.2869409Z triton_mm_6929 0.0502 ms 33.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:33.2870266Z SingleProcess AUTOTUNE benchmarking takes 0.3271 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:33.6440004Z Autotune Choices Stats: 2025-09-07T10:01:33.6441911Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.016672000288963318, "best_triton_pos": 1, "best_triton_time": 0.020608000457286835, "best_triton_kernel": "triton_mm_6954", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:33.7366416Z AUTOTUNE mm(1024x6272, 6272x128) 2025-09-07T10:01:33.7366827Z strides: [1, 1024], [128, 1] 2025-09-07T10:01:33.7367247Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:33.7367686Z mm 0.0167 ms 100.0% 2025-09-07T10:01:33.7368656Z triton_mm_6954 0.0206 ms 80.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:33.7370211Z triton_mm_6958 0.0214 ms 77.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:33.7372208Z triton_mm_6962 0.0250 ms 66.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:33.7373754Z triton_mm_6968 0.0394 ms 42.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:33.7376102Z triton_mm_6953 0.0413 ms 40.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:33.7377730Z triton_mm_6952 0.0434 ms 38.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:33.7378678Z triton_mm_6957 0.0441 ms 37.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:33.7379646Z triton_mm_6961 0.0443 ms 37.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:33.7380608Z triton_mm_6967 0.0499 ms 33.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:33.7381448Z SingleProcess AUTOTUNE benchmarking takes 0.4052 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:34.0773562Z Autotune Choices Stats: 2025-09-07T10:01:34.0775907Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.0098879998549819, "best_triton_pos": 1, "best_triton_time": 0.011071999557316303, "best_triton_kernel": "triton_mm_3576", "best_triton_kernel_desc": 
"ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:34.0918800Z AUTOTUNE mm(320x1568, 1568x320) 2025-09-07T10:01:34.0919236Z strides: [1, 320], [320, 1] 2025-09-07T10:01:34.0919627Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:34.0920038Z mm 0.0099 ms 100.0% 2025-09-07T10:01:34.0920955Z triton_mm_3576 0.0111 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:34.0922499Z triton_mm_3572 0.0111 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:34.0923979Z triton_mm_3580 0.0131 ms 75.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:34.0925751Z triton_mm_3571 0.0147 ms 67.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:34.0927296Z triton_mm_3570 0.0153 ms 64.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:34.0928221Z triton_mm_3575 0.0162 ms 60.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:34.0929134Z triton_mm_3586 0.0163 ms 60.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:34.0930054Z triton_mm_3579 0.0166 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:34.0930953Z triton_mm_3578 0.0169 ms 58.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:34.0932118Z SingleProcess AUTOTUNE benchmarking takes 0.2302 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:34.9473701Z Autotune Choices Stats: 2025-09-07T10:01:34.9475739Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_7030", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008063999935984612, "best_triton_pos": 0} 2025-09-07T10:01:34.9586300Z AUTOTUNE mm(256x392, 392x128) 2025-09-07T10:01:34.9586650Z strides: [1, 256], [128, 1] 2025-09-07T10:01:34.9586996Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:34.9587840Z triton_mm_7030 0.0081 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:34.9588614Z mm 0.0081 ms 99.2% 2025-09-07T10:01:34.9589306Z triton_mm_7034 0.0082 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:34.9590483Z triton_mm_7029 0.0086 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:34.9591646Z triton_mm_7028 0.0089 ms 91.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:34.9592814Z triton_mm_7038 0.0089 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:34.9593977Z triton_mm_7033 0.0089 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:34.9595425Z triton_mm_7037 0.0094 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:34.9596634Z triton_mm_7027 0.0094 ms 85.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:34.9597777Z triton_mm_7036 0.0095 ms 84.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:34.9598638Z SingleProcess AUTOTUNE benchmarking takes 0.2112 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:01:37.0257299Z Autotune Choices Stats: 2025-09-07T10:01:37.0258779Z {"num_choices": 27, "num_triton_choices": 17, "best_kernel": "decompose_k_mm_7_split_75", "best_kernel_desc": "k_split=7", "best_time": 0.02393599972128868, "best_triton_pos": 10, "best_triton_time": 0.055904000997543335, "best_triton_kernel": "triton_mm_7676", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:37.1011745Z AUTOTUNE mm(64x25088, 25088x512) 2025-09-07T10:01:37.1012097Z strides: [1, 64], [512, 1] 2025-09-07T10:01:37.1012396Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:37.1012771Z decompose_k_mm_7_split_75 0.0239 ms 100.0% k_split=7 2025-09-07T10:01:37.1013144Z decompose_k_mm_4_split_74 0.0240 ms 99.9% k_split=4 2025-09-07T10:01:37.1013467Z mm 0.0244 ms 98.3% 
2025-09-07T10:01:37.1013725Z decompose_k_mm_2_split_73 0.0296 ms 80.9% k_split=2 2025-09-07T10:01:37.1014072Z decompose_k_mm_14_split_77 0.0434 ms 55.1% k_split=14 2025-09-07T10:01:37.1014438Z decompose_k_mm_8_split_76 0.0435 ms 55.1% k_split=8 2025-09-07T10:01:37.1015663Z decompose_k_mm_32_split_80 0.0439 ms 54.6% k_split=32 2025-09-07T10:01:37.1016036Z decompose_k_mm_16_split_78 0.0439 ms 54.5% k_split=16 2025-09-07T10:01:37.1016384Z decompose_k_mm_28_split_79 0.0440 ms 54.4% k_split=28 2025-09-07T10:01:37.1016722Z decompose_k_mm_49_split_72 0.0444 ms 54.0% k_split=49 2025-09-07T10:01:37.1017250Z SingleProcess AUTOTUNE benchmarking takes 1.9104 seconds and 0.0002 seconds precompiling for 27 choices 2025-09-07T10:01:38.2397167Z Autotune Choices Stats: 2025-09-07T10:01:38.2399388Z {"num_choices": 28, "num_triton_choices": 18, "best_kernel": "decompose_k_mm_4_split_83", "best_kernel_desc": "k_split=4", "best_time": 0.023744000121951103, "best_triton_pos": 10, "best_triton_time": 0.056223999708890915, "best_triton_kernel": "triton_mm_7711", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:38.2535316Z AUTOTUNE mm(512x25088, 25088x64) 2025-09-07T10:01:38.2535637Z strides: [1, 512], [64, 1] 2025-09-07T10:01:38.2535897Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:38.2536213Z decompose_k_mm_4_split_83 0.0237 ms 100.0% k_split=4 2025-09-07T10:01:38.2536586Z decompose_k_mm_7_split_84 0.0244 ms 97.5% k_split=7 2025-09-07T10:01:38.2536884Z mm 0.0246 ms 96.4% 2025-09-07T10:01:38.2537132Z decompose_k_mm_2_split_82 0.0291 ms 81.5% k_split=2 2025-09-07T10:01:38.2537471Z decompose_k_mm_16_split_87 0.0434 ms 54.7% k_split=16 2025-09-07T10:01:38.2537819Z decompose_k_mm_28_split_88 0.0434 ms 54.7% k_split=28 2025-09-07T10:01:38.2538155Z decompose_k_mm_14_split_86 0.0435 ms 54.6% k_split=14 2025-09-07T10:01:38.2538498Z 
decompose_k_mm_8_split_85 0.0435 ms 54.6% k_split=8 2025-09-07T10:01:38.2538842Z decompose_k_mm_32_split_89 0.0438 ms 54.2% k_split=32 2025-09-07T10:01:38.2539167Z decompose_k_mm_49_split_81 0.0439 ms 54.1% k_split=49 2025-09-07T10:01:38.2539705Z SingleProcess AUTOTUNE benchmarking takes 1.1178 seconds and 0.0002 seconds precompiling for 28 choices 2025-09-07T10:01:39.4860363Z Autotune Choices Stats: 2025-09-07T10:01:39.4861693Z {"num_choices": 29, "num_triton_choices": 19, "best_kernel": "decompose_k_mm_7_split_3", "best_kernel_desc": "k_split=7", "best_time": 0.01228800043463707, "best_triton_pos": 4, "best_triton_time": 0.018751999363303185, "best_triton_kernel": "triton_mm_6992", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:39.5364686Z AUTOTUNE mm(128x6272, 6272x128) 2025-09-07T10:01:39.5365323Z strides: [1, 128], [128, 1] 2025-09-07T10:01:39.5365597Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:39.5365937Z decompose_k_mm_7_split_3 0.0123 ms 100.0% k_split=7 2025-09-07T10:01:39.5366315Z decompose_k_mm_4_split_2 0.0128 ms 95.8% k_split=4 2025-09-07T10:01:39.5366622Z mm 0.0133 ms 92.5% 2025-09-07T10:01:39.5366893Z decompose_k_mm_2_split_1 0.0144 ms 85.1% k_split=2 2025-09-07T10:01:39.5367616Z triton_mm_6992 0.0188 ms 65.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:39.5368683Z triton_mm_6996 0.0209 ms 58.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:39.5369310Z decompose_k_mm_16_split_7 0.0215 ms 57.1% k_split=16 2025-09-07T10:01:39.5369644Z decompose_k_mm_28_split_5 0.0221 ms 55.6% k_split=28 2025-09-07T10:01:39.5369956Z decompose_k_mm_8_split_6 0.0222 ms 55.4% 
k_split=8 2025-09-07T10:01:39.5370258Z decompose_k_mm_49_split_0 0.0223 ms 55.1% k_split=49 2025-09-07T10:01:39.5370755Z SingleProcess AUTOTUNE benchmarking takes 1.2143 seconds and 0.0002 seconds precompiling for 29 choices 2025-09-07T10:01:39.7950133Z Autotune Choices Stats: 2025-09-07T10:01:39.7951483Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_7788", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.0077760000713169575, "best_triton_pos": 0} 2025-09-07T10:01:39.8630979Z AUTOTUNE mm(128x392, 392x64) 2025-09-07T10:01:39.8631234Z strides: [1, 128], [64, 1] 2025-09-07T10:01:39.8631478Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:39.8632496Z triton_mm_7788 0.0078 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:39.8633145Z mm 0.0079 ms 98.4% 2025-09-07T10:01:39.8633730Z triton_mm_7784 0.0080 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:39.8634729Z triton_mm_7780 0.0081 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:39.8636005Z triton_mm_7787 0.0084 ms 92.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:39.8636987Z triton_mm_7779 0.0085 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:39.8637957Z triton_mm_7783 0.0085 ms 91.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:39.8638927Z triton_mm_7786 0.0087 ms 89.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:39.8639899Z triton_mm_7793 0.0087 ms 89.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:39.8640797Z triton_mm_7778 0.0088 ms 88.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:39.8641581Z SingleProcess AUTOTUNE benchmarking takes 0.2533 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:01:42.3619409Z Autotune Choices Stats: 2025-09-07T10:01:42.3620928Z {"num_choices": 26, "num_triton_choices": 15, "best_kernel": "decompose_k_mm_49_split_90", "best_kernel_desc": "k_split=49", "best_time": 0.013504000380635262, "best_triton_pos": 11, "best_triton_time": 0.051872000098228455, "best_triton_kernel": "triton_mm_7747", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:42.4702932Z AUTOTUNE mm(64x25088, 25088x64) 2025-09-07T10:01:42.4703340Z strides: [1, 64], [64, 1] 2025-09-07T10:01:42.4703602Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:42.4703963Z decompose_k_mm_49_split_90 0.0135 ms 100.0% k_split=49 2025-09-07T10:01:42.4704328Z decompose_k_mm_98_split_91 0.0139 ms 97.0% k_split=98 2025-09-07T10:01:42.4704634Z mm 0.0142 ms 94.8% 2025-09-07T10:01:42.4704893Z decompose_k_mm_7_split_95 0.0153 ms 88.3% k_split=7 2025-09-07T10:01:42.4705624Z decompose_k_mm_16_split_98 0.0160 ms 84.6% k_split=16 
2025-09-07T10:01:42.4705973Z decompose_k_mm_28_split_99 0.0160 ms 84.4% k_split=28 2025-09-07T10:01:42.4706323Z decompose_k_mm_14_split_97 0.0167 ms 81.0% k_split=14 2025-09-07T10:01:42.4706671Z decompose_k_mm_8_split_96 0.0172 ms 78.7% k_split=8 2025-09-07T10:01:42.4707010Z decompose_k_mm_4_split_94 0.0182 ms 74.3% k_split=4 2025-09-07T10:01:42.4707843Z decompose_k_mm_196_split_92 0.0190 ms 71.0% k_split=196 2025-09-07T10:01:42.4708394Z SingleProcess AUTOTUNE benchmarking takes 2.2288 seconds and 0.0002 seconds precompiling for 26 choices 2025-09-07T10:01:45.3094212Z Autotune Choices Stats: 2025-09-07T10:01:45.3096800Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 0.008832000195980072, "best_triton_pos": 1, "best_triton_time": 0.009247999638319016, "best_triton_kernel": "triton_mm_2874", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2"} 2025-09-07T10:01:45.3867752Z AUTOTUNE mm(8x1000, 1000x512) 2025-09-07T10:01:45.3868167Z strides: [1000, 1], [512, 1] 2025-09-07T10:01:45.3868458Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:45.3868763Z mm 0.0088 ms 100.0% 2025-09-07T10:01:45.3869441Z triton_mm_2874 0.0092 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:01:45.3870631Z triton_mm_2878 0.0096 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:45.3871731Z triton_mm_2882 0.0098 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:45.3872831Z triton_mm_2872 0.0111 ms 79.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:45.3873900Z triton_mm_2873 0.0112 ms 78.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:01:45.3875202Z triton_mm_2886 0.0116 ms 76.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:45.3876480Z triton_mm_2877 0.0116 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:45.3877671Z triton_mm_2884 0.0124 ms 71.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:45.3878816Z triton_mm_2881 0.0128 ms 69.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:45.3879762Z SingleProcess AUTOTUNE benchmarking takes 0.2665 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:01:45.6196946Z Autotune Choices Stats: 2025-09-07T10:01:45.6198189Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.011071999557316303, "best_triton_pos": 1, "best_triton_time": 0.011296000331640244, "best_triton_kernel": "triton_mm_2945", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:45.8603480Z AUTOTUNE mm(392x2048, 2048x512) 2025-09-07T10:01:45.8603790Z strides: [2048, 1], [512, 1] 2025-09-07T10:01:45.8604065Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:45.8604342Z mm 0.0111 ms 100.0% 
2025-09-07T10:01:45.8606318Z triton_mm_2945 0.0113 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:45.8607400Z triton_mm_2949 0.0116 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:45.8608753Z triton_mm_2953 0.0142 ms 78.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:45.8609774Z triton_mm_2943 0.0177 ms 62.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:45.8610752Z triton_mm_2944 0.0180 ms 61.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:45.8611803Z triton_mm_2959 0.0184 ms 60.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:45.8612647Z triton_mm_2948 0.0188 ms 58.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:45.8613479Z triton_mm_2952 0.0189 ms 58.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:45.8614317Z triton_mm_2958 0.0208 ms 53.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:45.8615202Z SingleProcess AUTOTUNE benchmarking takes 0.4711 seconds and 0.0004 seconds 
precompiling for 20 choices 2025-09-07T10:01:46.0551240Z Autotune Choices Stats: 2025-09-07T10:01:46.0552201Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_2983", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007840000092983246, "best_triton_pos": 0} 2025-09-07T10:01:46.1517119Z AUTOTUNE mm(392x512, 512x512) 2025-09-07T10:01:46.1517366Z strides: [512, 1], [512, 1] 2025-09-07T10:01:46.1517608Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:46.1518257Z triton_mm_2983 0.0078 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:46.1519265Z triton_mm_2987 0.0080 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:46.1519878Z mm 0.0083 ms 95.0% 2025-09-07T10:01:46.1520451Z triton_mm_2982 0.0088 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.1521416Z triton_mm_2991 0.0089 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:46.1522323Z triton_mm_2981 0.0091 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.1523209Z triton_mm_2986 0.0091 ms 86.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:46.1524111Z triton_mm_2990 0.0095 ms 82.5% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:46.1525361Z triton_mm_2980 0.0099 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:46.1526270Z triton_mm_2989 0.0100 ms 78.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:46.1527323Z SingleProcess AUTOTUNE benchmarking takes 0.2898 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:46.3564317Z Autotune Choices Stats: 2025-09-07T10:01:46.3565628Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_3021", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00902399979531765, "best_triton_pos": 0} 2025-09-07T10:01:46.3768771Z AUTOTUNE mm(392x1024, 1024x512) 2025-09-07T10:01:46.3769375Z strides: [1024, 1], [512, 1] 2025-09-07T10:01:46.3769647Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:46.3770294Z triton_mm_3021 0.0090 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:46.3770923Z mm 0.0096 ms 94.3% 2025-09-07T10:01:46.3771520Z triton_mm_3025 0.0096 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:46.3772489Z triton_mm_3029 0.0104 ms 86.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:46.3773481Z triton_mm_3020 0.0117 ms 77.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.3774433Z triton_mm_3019 0.0118 ms 76.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.3775562Z triton_mm_3024 0.0123 ms 73.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:46.3776521Z triton_mm_3028 0.0124 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:46.3777483Z triton_mm_3035 0.0126 ms 71.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.3778444Z triton_mm_3027 0.0135 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:46.3779277Z SingleProcess AUTOTUNE benchmarking takes 0.2246 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:46.6243248Z Autotune Choices Stats: 2025-09-07T10:01:46.6244454Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010432000271975994, "best_triton_pos": 1, "best_triton_time": 0.010847999714314938, "best_triton_kernel": "triton_mm_3519", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:46.6455774Z AUTOTUNE mm(1568x1280, 1280x320) 2025-09-07T10:01:46.6456057Z strides: [1280, 1], [320, 1] 2025-09-07T10:01:46.6456321Z dtypes: torch.float16, torch.float16 
2025-09-07T10:01:46.6456585Z mm 0.0104 ms 100.0% 2025-09-07T10:01:46.6457203Z triton_mm_3519 0.0108 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:46.6458198Z triton_mm_3523 0.0118 ms 88.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:46.6459168Z triton_mm_3515 0.0142 ms 73.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:46.6460448Z triton_mm_3529 0.0142 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.6461340Z triton_mm_3518 0.0143 ms 72.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:46.6462431Z triton_mm_3522 0.0148 ms 70.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:46.6463287Z triton_mm_3514 0.0162 ms 64.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.6464120Z triton_mm_3521 0.0164 ms 63.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:46.6465128Z triton_mm_3528 0.0166 ms 62.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:46.6465874Z SingleProcess AUTOTUNE 
benchmarking takes 0.2328 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:46.8373985Z Autotune Choices Stats: 2025-09-07T10:01:46.8375552Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.00800000037997961, "best_triton_pos": 1, "best_triton_time": 0.008191999979317188, "best_triton_kernel": "triton_mm_3557", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:01:46.8528759Z AUTOTUNE mm(1568x320, 320x320) 2025-09-07T10:01:46.8529031Z strides: [320, 1], [320, 1] 2025-09-07T10:01:46.8529287Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:46.8529553Z mm 0.0080 ms 100.0% 2025-09-07T10:01:46.8530157Z triton_mm_3557 0.0082 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:46.8531148Z triton_mm_3556 0.0083 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:46.8532141Z triton_mm_3560 0.0087 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:46.8533144Z triton_mm_3561 0.0088 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:46.8534148Z triton_mm_3559 0.0089 ms 89.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:46.8535353Z triton_mm_3563 0.0091 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:46.8536315Z triton_mm_3552 0.0092 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.8537288Z triton_mm_3567 0.0094 ms 84.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.8538253Z triton_mm_3551 0.0095 ms 84.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:46.8539393Z SingleProcess AUTOTUNE benchmarking takes 0.2058 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:47.0500761Z Autotune Choices Stats: 2025-09-07T10:01:47.0501822Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_3591", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008191999979317188, "best_triton_pos": 0} 2025-09-07T10:01:47.1262352Z AUTOTUNE mm(392x640, 640x320) 2025-09-07T10:01:47.1262903Z strides: [640, 1], [320, 1] 2025-09-07T10:01:47.1263177Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:47.1263852Z triton_mm_3591 0.0082 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:47.1264516Z mm 0.0083 ms 99.2% 2025-09-07T10:01:47.1265458Z triton_mm_3595 0.0084 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:47.1266464Z triton_mm_3599 0.0092 ms 89.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:47.1267440Z triton_mm_3590 0.0095 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:47.1268419Z triton_mm_3594 0.0098 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:47.1269394Z triton_mm_3589 0.0099 ms 83.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:47.1270376Z triton_mm_3598 0.0101 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:47.1271508Z triton_mm_3588 0.0106 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:47.1272481Z triton_mm_3605 0.0107 ms 76.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:47.1273334Z SingleProcess AUTOTUNE benchmarking takes 0.2717 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:47.6591292Z Autotune Choices Stats: 2025-09-07T10:01:47.6592515Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.013439999893307686, "best_triton_pos": 1, "best_triton_time": 0.013663999736309052, "best_triton_kernel": "triton_mm_6943", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:01:47.7183670Z AUTOTUNE mm(6272x1024, 1024x128) 
2025-09-07T10:01:47.7183962Z strides: [1024, 1], [128, 1] 2025-09-07T10:01:47.7184231Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:47.7184497Z mm 0.0134 ms 100.0% 2025-09-07T10:01:47.7185546Z triton_mm_6943 0.0137 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:47.7186562Z triton_mm_6949 0.0150 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:47.7187561Z triton_mm_6942 0.0156 ms 86.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:47.7188824Z triton_mm_6938 0.0158 ms 85.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:47.7189815Z triton_mm_6939 0.0161 ms 83.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:47.7191084Z triton_mm_6948 0.0170 ms 79.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:47.7192112Z triton_mm_6941 0.0177 ms 76.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:47.7193084Z triton_mm_6945 0.0184 ms 73.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:47.7194062Z triton_mm_6935 0.0188 ms 71.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:47.7194905Z SingleProcess AUTOTUNE benchmarking takes 0.2718 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:47.9072084Z Autotune Choices Stats: 2025-09-07T10:01:47.9073068Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_6980", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.00800000037997961, "best_triton_pos": 0} 2025-09-07T10:01:48.1705832Z AUTOTUNE mm(6272x128, 128x128) 2025-09-07T10:01:48.1706108Z strides: [128, 1], [128, 1] 2025-09-07T10:01:48.1706395Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:48.1707089Z triton_mm_6980 0.0080 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:48.1708126Z triton_mm_6977 0.0081 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:48.1709116Z triton_mm_6983 0.0081 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:48.1710096Z triton_mm_6981 0.0082 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:48.1711194Z triton_mm_6978 0.0082 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:48.1711861Z mm 0.0082 ms 97.3% 2025-09-07T10:01:48.1712424Z triton_mm_6979 0.0082 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:48.1713387Z triton_mm_6982 0.0082 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:48.1714350Z triton_mm_6976 0.0083 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:48.1715609Z triton_mm_6973 0.0086 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:48.1716457Z SingleProcess AUTOTUNE benchmarking takes 0.4500 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:48.3714680Z Autotune Choices Stats: 2025-09-07T10:01:48.3715915Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_7011", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.007104000076651573, "best_triton_pos": 0} 2025-09-07T10:01:48.4137308Z AUTOTUNE mm(392x256, 256x128) 2025-09-07T10:01:48.4137573Z strides: [256, 1], [128, 1] 2025-09-07T10:01:48.4137827Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:48.4138801Z triton_mm_7011 0.0071 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:48.4139816Z triton_mm_7015 0.0073 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:48.4140448Z mm 0.0074 ms 96.1% 2025-09-07T10:01:48.4141059Z triton_mm_7010 0.0075 ms 94.9% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:48.4142034Z triton_mm_7014 0.0075 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:48.4142866Z triton_mm_7009 0.0077 ms 92.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:48.4143703Z triton_mm_7008 0.0077 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:48.4144536Z triton_mm_7018 0.0078 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:48.4145564Z triton_mm_7019 0.0078 ms 91.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:48.4146415Z triton_mm_7017 0.0082 ms 86.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:48.4147147Z SingleProcess AUTOTUNE benchmarking takes 0.2320 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:01:48.7384809Z Autotune Choices Stats: 2025-09-07T10:01:48.7386145Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_7700", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.019007999449968338, "best_triton_pos": 0} 2025-09-07T10:01:48.8048428Z AUTOTUNE mm(25088x512, 512x64) 2025-09-07T10:01:48.8048716Z 
strides: [512, 1], [64, 1] 2025-09-07T10:01:48.8048997Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:48.8049697Z triton_mm_7700 0.0190 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:48.8050688Z triton_mm_7696 0.0191 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:48.8051688Z triton_mm_7705 0.0196 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:48.8052309Z mm 0.0201 ms 94.7% 2025-09-07T10:01:48.8052896Z triton_mm_7706 0.0209 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:48.8054196Z triton_mm_7697 0.0212 ms 89.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:48.8055340Z triton_mm_7698 0.0214 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:48.8056596Z triton_mm_7699 0.0215 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:48.8057560Z triton_mm_7701 0.0217 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:48.8058520Z triton_mm_7703 0.0222 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:48.8059373Z SingleProcess AUTOTUNE benchmarking takes 0.2700 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:01:48.9927584Z Autotune Choices Stats: 2025-09-07T10:01:48.9928566Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_7732", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.008511999621987343, "best_triton_pos": 0} 2025-09-07T10:01:49.0433120Z AUTOTUNE mm(25088x64, 64x64) 2025-09-07T10:01:49.0433378Z strides: [64, 1], [64, 1] 2025-09-07T10:01:49.0433623Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:49.0434264Z triton_mm_7732 0.0085 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:49.0435542Z triton_mm_7733 0.0086 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:49.0436518Z triton_mm_7734 0.0087 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:49.0437477Z triton_mm_7737 0.0087 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:49.0438440Z triton_mm_7736 0.0087 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:49.0439398Z triton_mm_7735 0.0088 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=8 2025-09-07T10:01:49.0440364Z triton_mm_7742 0.0089 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:49.0441325Z triton_mm_7731 0.0090 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:49.0442294Z triton_mm_7738 0.0090 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:49.0443182Z triton_mm_7741 0.0091 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:49.0443974Z SingleProcess AUTOTUNE benchmarking takes 0.2286 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:01:49.2316834Z Autotune Choices Stats: 2025-09-07T10:01:49.2317774Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_mm_7769", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.006688000168651342, "best_triton_pos": 0} 2025-09-07T10:01:49.2485435Z AUTOTUNE mm(392x128, 128x64) 2025-09-07T10:01:49.2485682Z strides: [128, 1], [64, 1] 2025-09-07T10:01:49.2485947Z dtypes: torch.float16, torch.float16 2025-09-07T10:01:49.2486851Z triton_mm_7769 0.0067 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:49.2487831Z triton_mm_7761 0.0068 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T10:01:49.2488824Z triton_mm_7765 0.0068 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:01:49.2489800Z triton_mm_7760 0.0069 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:01:49.2490751Z triton_mm_7759 0.0069 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:01:49.2491711Z triton_mm_7768 0.0070 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:01:49.2492237Z mm 0.0071 ms 94.1% 2025-09-07T10:01:49.2492726Z triton_mm_7766 0.0071 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:01:49.2493557Z triton_mm_7767 0.0071 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:01:49.2494385Z triton_mm_7770 0.0071 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:01:49.2495391Z SingleProcess AUTOTUNE benchmarking takes 0.1921 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:02:06.5823338Z W0907 10:02:06.580000 129725 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:03:40.4049032Z pass 2025-09-07T10:03:51.2216448Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. 
Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:03:51.2217529Z import pynvml # type: ignore[import] 2025-09-07T10:03:54.5742196Z 2025-09-07T10:03:56.0765875Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:03:56.0766244Z loading model: 0it [00:01, ?it/s] 2025-09-07T10:03:56.0766584Z cuda train visformer_small 2025-09-07T10:04:15.0617954Z Autotune Choices Stats: 2025-09-07T10:04:15.0619685Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.010400000028312206, "best_triton_pos": 1, "best_triton_time": 0.013088000006973743, "best_triton_kernel": "triton_convolution2d_17", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:04:15.1500354Z AUTOTUNE convolution(8x192x28x28, 384x192x1x1) 2025-09-07T10:04:15.1500753Z strides: [150528, 784, 28, 1], [192, 1, 1, 1] 2025-09-07T10:04:15.1501901Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:15.1502195Z convolution 0.0104 ms 100.0% 2025-09-07T10:04:15.1503005Z triton_convolution2d_17 0.0131 ms 79.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:15.1504662Z triton_convolution2d_18 0.0145 ms 71.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:15.1506458Z triton_convolution2d_16 0.0146 ms 71.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:15.1507775Z 
triton_convolution2d_19 0.0153 ms 68.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:15.1509094Z triton_convolution2d_13 0.0171 ms 60.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:15.1510389Z triton_convolution2d_14 0.0184 ms 56.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:15.1511687Z triton_convolution2d_15 0.0227 ms 45.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:15.1512577Z conv1x1_via_mm 0.0521 ms 20.0% 2025-09-07T10:04:15.1513086Z SingleProcess AUTOTUNE benchmarking takes 0.2262 seconds and 0.0003 seconds precompiling for 9 choices 2025-09-07T10:04:15.4573063Z Autotune Choices Stats: 2025-09-07T10:04:15.4574377Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_173", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.018592000007629395, "best_triton_pos": 0} 2025-09-07T10:04:15.4777207Z AUTOTUNE convolution(8x384x14x14, 1536x384x1x1) 2025-09-07T10:04:15.4777611Z strides: [75264, 196, 14, 1], [384, 1, 1, 1] 2025-09-07T10:04:15.4777935Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:15.4778769Z triton_convolution2d_173 0.0186 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, 
STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:15.4780125Z triton_convolution2d_172 0.0213 ms 87.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:15.4780968Z convolution 0.0220 ms 84.6% 2025-09-07T10:04:15.4781830Z triton_convolution2d_175 0.0221 ms 84.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:15.4783254Z triton_convolution2d_174 0.0225 ms 82.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:15.4784571Z triton_convolution2d_170 0.0256 ms 72.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:15.4786027Z triton_convolution2d_169 0.0279 ms 66.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:15.4787912Z triton_convolution2d_171 0.0398 ms 46.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:15.4788723Z conv1x1_via_mm 0.0497 ms 37.4% 2025-09-07T10:04:15.4789522Z SingleProcess AUTOTUNE benchmarking takes 0.1615 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:15.6979710Z Autotune Choices Stats: 2025-09-07T10:04:15.6981007Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_12", "best_kernel_desc": "ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8", "best_time": 0.030527999624609947, "best_triton_pos": 0} 2025-09-07T10:04:15.7271843Z AUTOTUNE convolution(8x32x112x112, 192x32x4x4) 2025-09-07T10:04:15.7272219Z strides: [401408, 12544, 112, 1], [512, 16, 4, 1] 2025-09-07T10:04:15.7272646Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:15.7273565Z triton_convolution2d_12 0.0305 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:15.7275503Z triton_convolution2d_9 0.0306 ms 99.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:15.7276985Z triton_convolution2d_7 0.0349 ms 87.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:15.7278434Z triton_convolution2d_11 0.0350 ms 87.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:15.7279311Z convolution 0.0363 ms 84.2% 2025-09-07T10:04:15.7280154Z triton_convolution2d_10 0.0368 ms 83.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:15.7281590Z triton_convolution2d_6 0.0397 ms 76.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 
2025-09-07T10:04:15.7283030Z triton_convolution2d_8 0.1356 ms 22.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:04:15.7284071Z SingleProcess AUTOTUNE benchmarking takes 0.1435 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:04:16.1568363Z Autotune Choices Stats: 2025-09-07T10:04:16.1570288Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.011231999844312668, "best_triton_pos": 1, "best_triton_time": 0.016767999157309532, "best_triton_kernel": "triton_convolution2d_66", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:04:16.1727701Z AUTOTUNE convolution(8x384x28x28, 192x384x1x1) 2025-09-07T10:04:16.1728122Z strides: [301056, 784, 28, 1], [384, 1, 1, 1] 2025-09-07T10:04:16.1728515Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:16.1728874Z convolution 0.0112 ms 100.0% 2025-09-07T10:04:16.1729866Z triton_convolution2d_66 0.0168 ms 67.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:16.1732010Z triton_convolution2d_65 0.0195 ms 57.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:16.1733856Z triton_convolution2d_67 0.0204 ms 55.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:16.1735721Z triton_convolution2d_68 0.0204 ms 55.1% ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:16.1737191Z triton_convolution2d_62 0.0254 ms 44.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:16.1738665Z triton_convolution2d_63 0.0276 ms 40.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:16.1740120Z triton_convolution2d_64 0.0322 ms 34.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:16.1741005Z conv1x1_via_mm 0.0520 ms 21.6% 2025-09-07T10:04:16.1741651Z SingleProcess AUTOTUNE benchmarking takes 0.1499 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:16.5517930Z Autotune Choices Stats: 2025-09-07T10:04:16.5519488Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.027583999559283257, "best_triton_pos": 1, "best_triton_time": 0.03167999908328056, "best_triton_kernel": "triton_convolution2d_434", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T10:04:16.5655358Z AUTOTUNE convolution(8x768x7x7, 3072x768x1x1) 2025-09-07T10:04:16.5655721Z strides: [37632, 49, 7, 1], [768, 1, 1, 1] 2025-09-07T10:04:16.5656027Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:16.5656314Z convolution 0.0276 ms 100.0% 2025-09-07T10:04:16.5657144Z triton_convolution2d_434 0.0317 ms 87.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, 
BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:16.5658470Z triton_convolution2d_435 0.0407 ms 67.8% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:16.5659809Z triton_convolution2d_437 0.0453 ms 60.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:16.5661146Z triton_convolution2d_431 0.0459 ms 60.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:16.5662054Z conv1x1_via_mm 0.0501 ms 55.0% 2025-09-07T10:04:16.5662961Z triton_convolution2d_432 0.0553 ms 49.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:16.5664411Z triton_convolution2d_433 0.0682 ms 40.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=512, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:16.5666454Z triton_convolution2d_436 0.0831 ms 33.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:16.5667613Z SingleProcess AUTOTUNE benchmarking takes 0.1664 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:17.5155545Z Autotune Choices Stats: 2025-09-07T10:04:17.5157303Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_115", 
"best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.03017600066959858, "best_triton_pos": 0} 2025-09-07T10:04:17.5585538Z AUTOTUNE convolution(8x192x28x28, 384x192x2x2) 2025-09-07T10:04:17.5585956Z strides: [150528, 784, 28, 1], [768, 4, 2, 1] 2025-09-07T10:04:17.5586292Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:17.5587202Z triton_convolution2d_115 0.0302 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:17.5588084Z convolution 0.0315 ms 95.9% 2025-09-07T10:04:17.5588950Z triton_convolution2d_114 0.0346 ms 87.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:17.5590395Z triton_convolution2d_116 0.0351 ms 86.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:17.5591817Z triton_convolution2d_117 0.0363 ms 83.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:17.5593355Z triton_convolution2d_112 0.0431 ms 70.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:17.5594817Z triton_convolution2d_111 0.0632 ms 47.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, 
num_stages=2, num_warps=4 2025-09-07T10:04:17.5596454Z triton_convolution2d_113 0.1023 ms 29.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:04:17.5597626Z SingleProcess AUTOTUNE benchmarking takes 0.1593 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:04:18.0002465Z Autotune Choices Stats: 2025-09-07T10:04:18.0004228Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03248000144958496, "best_triton_pos": 1, "best_triton_time": 0.04451199993491173, "best_triton_kernel": "triton_convolution2d_245", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:04:18.0532715Z AUTOTUNE convolution(8x1536x14x14, 384x1536x1x1) 2025-09-07T10:04:18.0533140Z strides: [301056, 196, 14, 1], [1536, 1, 1, 1] 2025-09-07T10:04:18.0533477Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:18.0533782Z convolution 0.0325 ms 100.0% 2025-09-07T10:04:18.0534538Z triton_convolution2d_245 0.0445 ms 73.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:18.0538375Z conv1x1_via_mm 0.0469 ms 69.3% 2025-09-07T10:04:18.0539151Z triton_convolution2d_246 0.0531 ms 61.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:18.0540377Z triton_convolution2d_244 0.0537 ms 60.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:04:18.0541965Z triton_convolution2d_247 0.0551 ms 58.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:18.0543330Z triton_convolution2d_242 0.0711 ms 45.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:18.0544623Z triton_convolution2d_241 0.0738 ms 44.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:18.0546003Z triton_convolution2d_243 0.1153 ms 28.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:18.0546972Z SingleProcess AUTOTUNE benchmarking takes 0.2163 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:18.6744215Z Autotune Choices Stats: 2025-09-07T10:04:18.6745952Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.03743999823927879, "best_triton_pos": 1, "best_triton_time": 0.04809600114822388, "best_triton_kernel": "triton_convolution2d_382", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:04:18.7234000Z AUTOTUNE convolution(8x384x14x14, 768x384x2x2) 2025-09-07T10:04:18.7234329Z strides: [75264, 196, 14, 1], [1536, 4, 2, 1] 2025-09-07T10:04:18.7234616Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:18.7234877Z convolution 0.0374 ms 100.0% 2025-09-07T10:04:18.7235776Z triton_convolution2d_382 0.0481 ms 77.8% ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:18.7237008Z triton_convolution2d_381 0.0650 ms 57.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:18.7238243Z triton_convolution2d_383 0.0652 ms 57.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:18.7239462Z triton_convolution2d_384 0.0691 ms 54.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:18.7240679Z triton_convolution2d_379 0.0815 ms 45.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:18.7241895Z triton_convolution2d_380 0.1288 ms 29.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:04:18.7243108Z triton_convolution2d_378 0.1553 ms 24.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=2, KERNEL_W=2, PADDING_H=0, PADDING_W=0, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:18.7244625Z SingleProcess AUTOTUNE benchmarking takes 0.1963 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:04:19.2263136Z Autotune Choices Stats: 2025-09-07T10:04:19.2265304Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.026079999282956123, "best_triton_pos": 
2, "best_triton_time": 0.09971199929714203, "best_triton_kernel": "triton_convolution2d_501", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T10:04:19.2666953Z AUTOTUNE convolution(8x3072x7x7, 768x3072x1x1) 2025-09-07T10:04:19.2667314Z strides: [150528, 49, 7, 1], [3072, 1, 1, 1] 2025-09-07T10:04:19.2667629Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:19.2667913Z convolution 0.0261 ms 100.0% 2025-09-07T10:04:19.2668191Z conv1x1_via_mm 0.0445 ms 58.6% 2025-09-07T10:04:19.2668978Z triton_convolution2d_501 0.0997 ms 26.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:19.2670246Z triton_convolution2d_502 0.1334 ms 19.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:19.2671500Z triton_convolution2d_504 0.1493 ms 17.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:19.2672758Z triton_convolution2d_498 0.1571 ms 16.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:19.2674137Z triton_convolution2d_499 0.1812 ms 14.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:19.2675549Z triton_convolution2d_500 0.2117 ms 12.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=512, 
BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:19.2676775Z triton_convolution2d_503 0.3081 ms 8.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:19.2677749Z SingleProcess AUTOTUNE benchmarking takes 0.2818 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:19.9380108Z Autotune Choices Stats: 2025-09-07T10:04:19.9381294Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_convolution2d_1", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4", "best_time": 0.03852799907326698, "best_triton_pos": 0} 2025-09-07T10:04:19.9557386Z AUTOTUNE convolution(8x3x224x224, 32x3x7x7) 2025-09-07T10:04:19.9557703Z strides: [150528, 50176, 224, 1], [147, 49, 7, 1] 2025-09-07T10:04:19.9558018Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:19.9558769Z triton_convolution2d_1 0.0385 ms 100.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:19.9559999Z triton_convolution2d_5 0.0388 ms 99.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:19.9561676Z triton_convolution2d_3 0.0510 ms 75.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:19.9562436Z convolution 0.0543 ms 70.9% 2025-09-07T10:04:19.9563427Z 
triton_convolution2d_0 0.0571 ms 67.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:04:19.9564653Z triton_convolution2d_4 0.0626 ms 61.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:04:19.9566385Z triton_convolution2d_2 0.0953 ms 40.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=7, KERNEL_W=7, PADDING_H=3, PADDING_W=3, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:04:19.9567301Z SingleProcess AUTOTUNE benchmarking takes 0.1244 seconds and 0.0002 seconds precompiling for 7 choices 2025-09-07T10:04:20.1111399Z Autotune Choices Stats: 2025-09-07T10:04:20.1112598Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_122", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4", "best_time": 0.01756799966096878, "best_triton_pos": 0} 2025-09-07T10:04:20.1606557Z AUTOTUNE convolution(8x384x14x14, 1152x384x1x1) 2025-09-07T10:04:20.1606912Z strides: [75264, 196, 14, 1], [384, 1, 1, 1] 2025-09-07T10:04:20.1607217Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:20.1607997Z triton_convolution2d_122 0.0176 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:20.1609305Z triton_convolution2d_121 0.0195 ms 90.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 
2025-09-07T10:04:20.1610580Z triton_convolution2d_123 0.0202 ms 86.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:20.1611799Z triton_convolution2d_124 0.0203 ms 86.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:20.1612539Z convolution 0.0219 ms 80.1% 2025-09-07T10:04:20.1613279Z triton_convolution2d_119 0.0241 ms 73.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:20.1614564Z triton_convolution2d_118 0.0253 ms 69.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:20.1615879Z triton_convolution2d_120 0.0392 ms 44.8% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:20.1616577Z conv1x1_via_mm 0.0417 ms 42.1% 2025-09-07T10:04:20.1617028Z SingleProcess AUTOTUNE benchmarking takes 0.1827 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:20.3896447Z Autotune Choices Stats: 2025-09-07T10:04:20.3897475Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_bmm_137", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.009600000455975533, "best_triton_pos": 0} 2025-09-07T10:04:20.4025719Z AUTOTUNE bmm(48x196x64, 48x64x196) 2025-09-07T10:04:20.4026038Z strides: [12544, 64, 1], 
[12544, 196, 1] 2025-09-07T10:04:20.4026332Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:20.4027375Z triton_bmm_137 0.0096 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:20.4028408Z triton_bmm_136 0.0098 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:20.4029380Z triton_bmm_133 0.0101 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:20.4030360Z triton_bmm_134 0.0103 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:20.4031343Z triton_bmm_143 0.0103 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:20.4032317Z triton_bmm_135 0.0103 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:20.4033281Z triton_bmm_138 0.0103 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:20.4034388Z triton_bmm_139 0.0104 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:20.4035769Z triton_bmm_131 0.0105 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 
2025-09-07T10:04:20.4036741Z triton_bmm_142 0.0105 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:20.4037603Z SingleProcess AUTOTUNE benchmarking takes 0.2414 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:04:20.6201861Z Autotune Choices Stats: 2025-09-07T10:04:20.6202845Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_155", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.010208000428974628, "best_triton_pos": 0} 2025-09-07T10:04:20.6431108Z AUTOTUNE bmm(48x196x196, 48x196x64) 2025-09-07T10:04:20.6431432Z strides: [38416, 196, 1], [12544, 64, 1] 2025-09-07T10:04:20.6431717Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:20.6432409Z triton_bmm_155 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:20.6433424Z triton_bmm_153 0.0105 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:20.6434523Z triton_bmm_154 0.0105 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:20.6435674Z triton_bmm_158 0.0105 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:20.6436909Z triton_bmm_151 0.0106 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 
2025-09-07T10:04:20.6437985Z triton_bmm_160 0.0109 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:20.6439148Z triton_bmm_161 0.0110 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:20.6440133Z triton_bmm_157 0.0111 ms 92.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:20.6441116Z triton_bmm_146 0.0111 ms 91.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:20.6442114Z triton_bmm_145 0.0116 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:20.6442965Z SingleProcess AUTOTUNE benchmarking takes 0.2400 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:04:20.7741735Z Autotune Choices Stats: 2025-09-07T10:04:20.7743187Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.01414399966597557, "best_triton_pos": 1, "best_triton_time": 0.01600000075995922, "best_triton_kernel": "triton_convolution2d_166", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4"} 2025-09-07T10:04:21.0930854Z AUTOTUNE convolution(8x384x14x14, 384x384x1x1) 2025-09-07T10:04:21.0931266Z strides: [75264, 196, 14, 1], [384, 1, 1, 1] 2025-09-07T10:04:21.0931607Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:21.0931929Z convolution 0.0141 ms 100.0% 2025-09-07T10:04:21.0932795Z 
triton_convolution2d_166 0.0160 ms 88.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:21.0934211Z triton_convolution2d_165 0.0191 ms 74.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:21.0935522Z triton_convolution2d_167 0.0192 ms 73.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:21.0936659Z triton_convolution2d_168 0.0198 ms 71.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:21.0937808Z triton_convolution2d_163 0.0236 ms 60.1% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:21.0938943Z triton_convolution2d_162 0.0244 ms 57.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:21.0939631Z conv1x1_via_mm 0.0299 ms 47.3% 2025-09-07T10:04:21.0940316Z triton_convolution2d_164 0.0335 ms 42.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:21.0941228Z SingleProcess AUTOTUNE benchmarking takes 0.4495 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:21.3021351Z Autotune Choices Stats: 2025-09-07T10:04:21.3022881Z {"num_choices": 
9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.027807999402284622, "best_triton_pos": 1, "best_triton_time": 0.03174399957060814, "best_triton_kernel": "triton_convolution2d_388", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T10:04:21.3547521Z AUTOTUNE convolution(8x768x7x7, 2304x768x1x1) 2025-09-07T10:04:21.3547876Z strides: [37632, 49, 7, 1], [768, 1, 1, 1] 2025-09-07T10:04:21.3548159Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:21.3548444Z convolution 0.0278 ms 100.0% 2025-09-07T10:04:21.3549194Z triton_convolution2d_388 0.0317 ms 87.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:21.3550443Z triton_convolution2d_389 0.0400 ms 69.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:21.3551188Z conv1x1_via_mm 0.0409 ms 68.1% 2025-09-07T10:04:21.3551926Z triton_convolution2d_391 0.0441 ms 63.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:21.3553147Z triton_convolution2d_385 0.0453 ms 61.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:21.3554493Z triton_convolution2d_386 0.0531 ms 52.3% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 
2025-09-07T10:04:21.3555925Z triton_convolution2d_387 0.0685 ms 40.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=512, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:21.3557139Z triton_convolution2d_390 0.0832 ms 33.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:21.3558107Z SingleProcess AUTOTUNE benchmarking takes 0.2034 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:21.5410248Z Autotune Choices Stats: 2025-09-07T10:04:21.5411173Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_393", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.008063999935984612, "best_triton_pos": 0} 2025-09-07T10:04:21.5806034Z AUTOTUNE bmm(48x49x128, 48x128x49) 2025-09-07T10:04:21.5806322Z strides: [6272, 128, 1], [6272, 49, 1] 2025-09-07T10:04:21.5806600Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:21.5807250Z triton_bmm_393 0.0081 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:21.5808257Z triton_bmm_400 0.0082 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:21.5809220Z triton_bmm_404 0.0084 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:21.5810201Z triton_bmm_396 0.0090 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:21.5811564Z triton_bmm_394 0.0093 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:21.5812165Z bmm 0.0095 ms 85.1% 2025-09-07T10:04:21.5812931Z triton_bmm_395 0.0097 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:21.5813888Z triton_bmm_403 0.0100 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:21.5814867Z triton_bmm_406 0.0100 ms 80.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:21.5816129Z triton_bmm_401 0.0100 ms 80.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:21.5816907Z SingleProcess AUTOTUNE benchmarking takes 0.2249 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T10:04:21.7892265Z Autotune Choices Stats: 2025-09-07T10:04:21.7893358Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_421", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.007360000163316727, "best_triton_pos": 0} 2025-09-07T10:04:21.8139231Z AUTOTUNE bmm(48x49x49, 48x49x128) 2025-09-07T10:04:21.8139523Z strides: [2401, 49, 1], [6272, 128, 1] 2025-09-07T10:04:21.8139808Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:21.8140462Z triton_bmm_421 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:21.8141475Z triton_bmm_414 0.0074 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:04:21.8142530Z triton_bmm_415 0.0076 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:21.8143497Z triton_bmm_409 0.0076 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:21.8144490Z triton_bmm_408 0.0077 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:21.8145498Z triton_bmm_423 0.0078 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:21.8146335Z triton_bmm_411 0.0078 ms 93.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:21.8147163Z triton_bmm_418 0.0079 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:21.8148005Z triton_bmm_419 0.0079 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:21.8148834Z triton_bmm_410 0.0081 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T10:04:21.8149941Z SingleProcess AUTOTUNE benchmarking takes 0.2328 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:04:21.9627546Z Autotune Choices Stats: 2025-09-07T10:04:21.9629405Z {"num_choices": 9, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.021088000386953354, "best_triton_pos": 2, "best_triton_time": 0.03081599995493889, "best_triton_kernel": "triton_convolution2d_427", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8"} 2025-09-07T10:04:22.0093091Z AUTOTUNE convolution(8x768x7x7, 768x768x1x1) 2025-09-07T10:04:22.0093404Z strides: [37632, 49, 7, 1], [768, 1, 1, 1] 2025-09-07T10:04:22.0093691Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:22.0093976Z convolution 0.0211 ms 100.0% 2025-09-07T10:04:22.0094227Z conv1x1_via_mm 0.0238 ms 88.7% 2025-09-07T10:04:22.0095614Z triton_convolution2d_427 0.0308 ms 68.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:22.0096919Z triton_convolution2d_428 0.0386 ms 54.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:22.0098158Z triton_convolution2d_430 0.0440 ms 47.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:22.0099369Z triton_convolution2d_424 0.0449 ms 46.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:22.0100580Z 
triton_convolution2d_425 0.0518 ms 40.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=4 2025-09-07T10:04:22.0101886Z triton_convolution2d_426 0.0592 ms 35.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=512, BLOCK_N=16, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=1, num_warps=8 2025-09-07T10:04:22.0103088Z triton_convolution2d_429 0.0844 ms 25.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=1, KERNEL_W=1, PADDING_H=0, PADDING_W=0, STRIDE_H=1, STRIDE_W=1, UNROLL=True, num_stages=2, num_warps=8 2025-09-07T10:04:22.0104050Z SingleProcess AUTOTUNE benchmarking takes 0.1948 seconds and 0.0002 seconds precompiling for 9 choices 2025-09-07T10:04:22.3095398Z Autotune Choices Stats: 2025-09-07T10:04:22.3096468Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_629", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.008960000239312649, "best_triton_pos": 0} 2025-09-07T10:04:22.3398615Z AUTOTUNE addmm(8x1000, 8x768, 768x1000) 2025-09-07T10:04:22.3398907Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:04:22.3399213Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:04:22.3399904Z triton_mm_629 0.0090 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:04:22.3400567Z bias_addmm 0.0095 ms 94.3% 2025-09-07T10:04:22.3401178Z triton_mm_633 0.0096 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:22.3402175Z triton_mm_637 0.0100 ms 89.2% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:22.3403588Z triton_mm_628 0.0108 ms 83.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:04:22.3404557Z triton_mm_627 0.0109 ms 82.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:22.3405916Z triton_mm_641 0.0111 ms 80.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:22.3406829Z triton_mm_632 0.0114 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:22.3407719Z triton_mm_626 0.0116 ms 76.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:04:22.3408615Z triton_mm_639 0.0124 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:22.3409404Z SingleProcess AUTOTUNE benchmarking takes 0.2684 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:04:36.4240280Z Autotune Choices Stats: 2025-09-07T10:04:36.4241662Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_661", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.006752000190317631, "best_triton_pos": 0} 2025-09-07T10:04:36.4474320Z AUTOTUNE mm(1000x8, 8x768) 
2025-09-07T10:04:36.4474816Z strides: [1, 1000], [768, 1] 2025-09-07T10:04:36.4475805Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:36.4476881Z triton_mm_661 0.0068 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:36.4478622Z triton_mm_665 0.0068 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:04:36.4480170Z triton_mm_669 0.0068 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:36.4481695Z triton_mm_671 0.0068 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:36.4483099Z triton_mm_663 0.0068 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:36.4484612Z triton_mm_668 0.0070 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:36.4486370Z triton_mm_666 0.0070 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:36.4487889Z triton_mm_667 0.0070 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:36.4489389Z triton_mm_670 0.0070 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4 2025-09-07T10:04:36.4490791Z triton_mm_664 0.0070 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:36.4492630Z SingleProcess AUTOTUNE benchmarking takes 0.1887 seconds and 0.0004 seconds precompiling for 17 choices 2025-09-07T10:04:37.1341867Z Autotune Choices Stats: 2025-09-07T10:04:37.1343787Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 0.009151999838650227, "best_triton_pos": 1, "best_triton_time": 0.009312000125646591, "best_triton_kernel": "triton_mm_646", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2"} 2025-09-07T10:04:37.1887968Z AUTOTUNE mm(8x1000, 1000x768) 2025-09-07T10:04:37.1888261Z strides: [1000, 1], [768, 1] 2025-09-07T10:04:37.1888520Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:37.1888802Z mm 0.0092 ms 100.0% 2025-09-07T10:04:37.1889435Z triton_mm_646 0.0093 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:04:37.1890505Z triton_mm_650 0.0098 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:37.1891510Z triton_mm_654 0.0101 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:37.1892515Z triton_mm_645 0.0115 ms 79.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:04:37.1893469Z triton_mm_644 0.0117 ms 78.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:37.1894432Z triton_mm_658 0.0117 ms 78.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:37.1895588Z triton_mm_649 0.0124 ms 73.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:37.1896557Z triton_mm_656 0.0131 ms 69.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:37.1897541Z triton_mm_653 0.0132 ms 69.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:37.1898452Z SingleProcess AUTOTUNE benchmarking takes 0.2383 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:04:37.9987051Z Autotune Choices Stats: 2025-09-07T10:04:37.9988293Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_676", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.007679999805986881, "best_triton_pos": 0} 2025-09-07T10:04:38.2057122Z AUTOTUNE bmm(48x49x49, 48x49x128) 2025-09-07T10:04:38.2057525Z strides: [2401, 1, 49], [6272, 1, 49] 2025-09-07T10:04:38.2057935Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:38.2058900Z triton_bmm_676 0.0077 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:38.2060393Z triton_bmm_679 0.0078 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:38.2061929Z triton_bmm_683 0.0078 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:38.2063729Z triton_bmm_678 0.0081 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.2065346Z triton_bmm_682 0.0081 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:04:38.2067022Z triton_bmm_691 0.0081 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.2068492Z triton_bmm_677 0.0083 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.2078918Z triton_bmm_686 0.0084 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:38.2080265Z triton_bmm_687 0.0085 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:38.2081461Z triton_bmm_675 0.0087 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:04:38.2082487Z SingleProcess AUTOTUNE benchmarking takes 1.0157 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:04:38.3630374Z Autotune Choices Stats: 2025-09-07T10:04:38.3631668Z 
{"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_700", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.008736000396311283, "best_triton_pos": 0} 2025-09-07T10:04:38.4308268Z AUTOTUNE bmm(48x49x128, 48x128x49) 2025-09-07T10:04:38.4308695Z strides: [6272, 1, 49], [6272, 1, 128] 2025-09-07T10:04:38.4309092Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:38.4309957Z triton_bmm_700 0.0087 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:38.4311287Z triton_bmm_704 0.0087 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:38.4312585Z triton_bmm_693 0.0089 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:38.4313399Z bmm 0.0093 ms 94.1% 2025-09-07T10:04:38.4314146Z triton_bmm_694 0.0094 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.4315763Z triton_bmm_703 0.0098 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:38.4317057Z triton_bmm_699 0.0098 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:04:38.4318331Z triton_bmm_706 0.0099 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=8 2025-09-07T10:04:38.4319611Z triton_bmm_701 0.0100 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:38.4320885Z triton_bmm_695 0.0104 ms 84.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.4322283Z SingleProcess AUTOTUNE benchmarking takes 0.2243 seconds and 0.0004 seconds precompiling for 16 choices 2025-09-07T10:04:38.6115724Z Autotune Choices Stats: 2025-09-07T10:04:38.6117067Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_719", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.007872000336647034, "best_triton_pos": 0} 2025-09-07T10:04:38.6454055Z AUTOTUNE bmm(48x128x49, 48x49x49) 2025-09-07T10:04:38.6454324Z strides: [6272, 1, 128], [2401, 49, 1] 2025-09-07T10:04:38.6454604Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:38.6455401Z triton_bmm_719 0.0079 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:38.6456397Z triton_bmm_718 0.0079 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:38.6457396Z triton_bmm_708 0.0081 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:38.6458356Z triton_bmm_714 0.0082 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=8 2025-09-07T10:04:38.6459385Z triton_bmm_715 0.0082 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:38.6460429Z triton_bmm_711 0.0082 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:38.6461562Z triton_bmm_710 0.0085 ms 92.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.6462636Z triton_bmm_709 0.0086 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.6463702Z triton_bmm_723 0.0088 ms 89.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:38.6464771Z triton_bmm_724 0.0088 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.6465842Z SingleProcess AUTOTUNE benchmarking takes 0.2138 seconds and 0.0003 seconds precompiling for 19 choices 2025-09-07T10:04:38.8132135Z Autotune Choices Stats: 2025-09-07T10:04:38.8133717Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_732", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.007615999784320593, "best_triton_pos": 0} 2025-09-07T10:04:38.8562935Z AUTOTUNE bmm(48x49x49, 48x49x128) 2025-09-07T10:04:38.8563229Z strides: [2401, 49, 1], [6272, 1, 49] 2025-09-07T10:04:38.8563519Z dtypes: torch.float16, torch.float16 
2025-09-07T10:04:38.8564188Z triton_bmm_732 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:04:38.8565352Z triton_bmm_726 0.0077 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:38.8566531Z triton_bmm_731 0.0079 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:38.8567639Z triton_bmm_738 0.0079 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:38.8568815Z triton_bmm_741 0.0079 ms 96.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.8569797Z triton_bmm_733 0.0080 ms 95.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:38.8570691Z triton_bmm_737 0.0080 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:38.8571598Z triton_bmm_729 0.0081 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:38.8572495Z triton_bmm_728 0.0081 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:38.8573391Z triton_bmm_736 0.0081 ms 94.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:38.8574170Z SingleProcess AUTOTUNE benchmarking takes 0.2103 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:04:39.3694523Z Autotune Choices Stats: 2025-09-07T10:04:39.3696359Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_957", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.01033599954098463, "best_triton_pos": 0} 2025-09-07T10:04:39.4279076Z AUTOTUNE bmm(48x196x196, 48x196x64) 2025-09-07T10:04:39.4279549Z strides: [38416, 1, 196], [12544, 1, 196] 2025-09-07T10:04:39.4279988Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:39.4280962Z triton_bmm_957 0.0103 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:39.4282493Z triton_bmm_954 0.0105 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.4283984Z triton_bmm_952 0.0106 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.4286154Z triton_bmm_950 0.0107 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:04:39.4287661Z triton_bmm_953 0.0107 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:39.4289167Z triton_bmm_956 0.0111 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.4290654Z triton_bmm_959 0.0111 ms 93.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.4292125Z triton_bmm_946 0.0114 ms 91.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:39.4294227Z triton_bmm_960 0.0114 ms 91.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:39.4296115Z triton_bmm_945 0.0118 ms 87.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:39.4297404Z SingleProcess AUTOTUNE benchmarking takes 0.5228 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:04:39.6598402Z Autotune Choices Stats: 2025-09-07T10:04:39.6599607Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_bmm_972", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.009983999654650688, "best_triton_pos": 0} 2025-09-07T10:04:39.6807592Z AUTOTUNE bmm(48x196x64, 48x64x196) 2025-09-07T10:04:39.6807912Z strides: [12544, 1, 196], [12544, 1, 64] 2025-09-07T10:04:39.6808213Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:39.6808929Z triton_bmm_972 0.0100 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.6810007Z triton_bmm_973 0.0101 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:39.6811042Z triton_bmm_969 0.0101 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:04:39.6812042Z triton_bmm_979 0.0101 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:39.6813020Z triton_bmm_970 0.0101 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.6814006Z triton_bmm_971 0.0102 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:39.6815428Z triton_bmm_977 0.0103 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.6816409Z triton_bmm_967 0.0105 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:39.6817379Z triton_bmm_978 0.0105 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.6818353Z triton_bmm_974 0.0106 ms 94.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:39.6819215Z SingleProcess AUTOTUNE benchmarking takes 0.2522 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:04:39.8790928Z Autotune Choices Stats: 2025-09-07T10:04:39.8791890Z 
{"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_991", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.01017600018531084, "best_triton_pos": 0} 2025-09-07T10:04:40.0289581Z AUTOTUNE bmm(48x64x196, 48x196x196) 2025-09-07T10:04:40.0290086Z strides: [12544, 1, 64], [38416, 196, 1] 2025-09-07T10:04:40.0290503Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:40.0291964Z triton_bmm_991 0.0102 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:40.0293591Z triton_bmm_990 0.0102 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:40.0294880Z triton_bmm_987 0.0103 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:04:40.0297312Z triton_bmm_994 0.0104 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:40.0298756Z triton_bmm_993 0.0106 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:40.0300200Z triton_bmm_996 0.0107 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:40.0301755Z triton_bmm_983 0.0108 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T10:04:40.0302802Z triton_bmm_989 0.0108 ms 93.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:40.0304216Z triton_bmm_992 0.0110 ms 92.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:04:40.0305903Z triton_bmm_982 0.0115 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:40.0307143Z SingleProcess AUTOTUNE benchmarking takes 0.3471 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:04:40.2780088Z Autotune Choices Stats: 2025-09-07T10:04:40.2781649Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_1011", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8", "best_time": 0.010847999714314938, "best_triton_pos": 0} 2025-09-07T10:04:40.3026212Z AUTOTUNE bmm(48x196x196, 48x196x64) 2025-09-07T10:04:40.3026728Z strides: [38416, 196, 1], [12544, 1, 196] 2025-09-07T10:04:40.3027130Z dtypes: torch.float16, torch.float16 2025-09-07T10:04:40.3028073Z triton_bmm_1011 0.0108 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:40.3029604Z triton_bmm_1004 0.0109 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:04:40.3031162Z triton_bmm_1008 0.0110 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:04:40.3032653Z triton_bmm_1007 0.0113 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:04:40.3034154Z triton_bmm_1010 0.0113 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:40.3035901Z triton_bmm_1013 0.0116 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:04:40.3037794Z triton_bmm_1000 0.0119 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:40.3039547Z triton_bmm_1014 0.0120 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:40.3041521Z triton_bmm_999 0.0122 ms 89.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:04:40.3043051Z triton_bmm_1003 0.0124 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:04:40.3044364Z SingleProcess AUTOTUNE benchmarking takes 0.2725 seconds and 0.0005 seconds precompiling for 19 choices 2025-09-07T10:04:45.8566343Z W0907 10:04:45.855000 146086 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:05:03.2303423Z pass 2025-09-07T10:05:08.3487170Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. 
If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:05:08.3488391Z import pynvml # type: ignore[import] 2025-09-07T10:05:11.5936032Z 2025-09-07T10:05:13.1860905Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:05:13.1861325Z loading model: 0it [00:01, ?it/s] 2025-09-07T10:05:13.1861621Z cuda train vit_base_patch16_224 2025-09-07T10:05:30.6501073Z Autotune Choices Stats: 2025-09-07T10:05:30.6502510Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.01865600049495697, "best_triton_pos": 1, "best_triton_time": 0.023711999878287315, "best_triton_kernel": "triton_mm_62", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:05:30.8005327Z AUTOTUNE addmm(1576x3072, 1576x768, 768x3072) 2025-09-07T10:05:30.8005715Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:05:30.8006044Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:05:30.8006367Z bias_addmm 0.0187 ms 100.0% 2025-09-07T10:05:30.8007056Z triton_mm_62 0.0237 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:30.8008063Z triton_mm_56 0.0246 ms 75.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:30.8009041Z triton_mm_63 0.0275 ms 67.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:30.8010058Z triton_mm_55 0.0280 ms 66.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T10:05:30.8011038Z triton_mm_61 0.0282 ms 66.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:30.8011659Z addmm 0.0287 ms 65.1% 2025-09-07T10:05:30.8012248Z triton_mm_58 0.0306 ms 61.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:30.8013238Z triton_mm_57 0.0308 ms 60.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:30.8014634Z triton_mm_59 0.0313 ms 59.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:30.8015729Z SingleProcess AUTOTUNE benchmarking takes 0.9669 seconds and 0.0003 seconds precompiling for 21 choices 2025-09-07T10:05:34.2477907Z Autotune Choices Stats: 2025-09-07T10:05:34.2479875Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.130048006772995, "best_triton_pos": 1, "best_triton_time": 0.13177600502967834, "best_triton_kernel": "triton_convolution2d_6", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:05:34.2608888Z AUTOTUNE convolution(8x3x224x224, 768x3x16x16) 2025-09-07T10:05:34.2609233Z strides: [150528, 50176, 224, 1], [768, 256, 16, 1] 2025-09-07T10:05:34.2609560Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:34.2609854Z convolution 0.1300 ms 100.0% 2025-09-07T10:05:34.2610636Z triton_convolution2d_6 0.1318 ms 98.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, 
STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:05:34.2611906Z triton_convolution2d_3 0.1463 ms 88.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:05:34.2613143Z triton_convolution2d_1 0.1487 ms 87.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:05:34.2614406Z triton_convolution2d_4 0.1713 ms 75.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:05:34.2615941Z triton_convolution2d_5 0.1951 ms 66.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:05:34.2617113Z triton_convolution2d_0 0.2188 ms 59.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:05:34.2618255Z triton_convolution2d_2 0.4064 ms 32.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=16, KERNEL_W=16, PADDING_H=0, PADDING_W=0, STRIDE_H=16, STRIDE_W=16, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:05:34.2619158Z SingleProcess AUTOTUNE benchmarking takes 0.2307 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:05:34.8783133Z Autotune Choices Stats: 2025-09-07T10:05:34.8784798Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.015744000673294067, "best_triton_pos": 1, "best_triton_time": 0.01772800087928772, "best_triton_kernel": "triton_mm_24", 
"best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:05:35.2399094Z AUTOTUNE addmm(1576x2304, 1576x768, 768x2304) 2025-09-07T10:05:35.2399401Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:05:35.2399713Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:05:35.2400008Z bias_addmm 0.0157 ms 100.0% 2025-09-07T10:05:35.2400585Z triton_mm_24 0.0177 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:35.2401500Z triton_mm_23 0.0204 ms 77.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:35.2410770Z triton_mm_25 0.0211 ms 74.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:35.2412037Z triton_mm_18 0.0212 ms 74.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:35.3274898Z triton_mm_16 0.0228 ms 69.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:35.3276497Z addmm 0.0231 ms 68.2% 2025-09-07T10:05:35.3277123Z triton_mm_14 0.0241 ms 65.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:35.3278109Z triton_mm_20 0.0241 ms 65.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:35.3279067Z 
triton_mm_17 0.0244 ms 64.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:35.3279937Z SingleProcess AUTOTUNE benchmarking takes 0.9779 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:05:35.5676787Z Autotune Choices Stats: 2025-09-07T10:05:35.5678745Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010688000358641148, "best_triton_pos": 1, "best_triton_time": 0.013055999763309956, "best_triton_kernel": "triton_mm_33", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T10:05:35.7314131Z AUTOTUNE mm(1576x768, 768x768) 2025-09-07T10:05:35.7314689Z strides: [768, 1], [1, 768] 2025-09-07T10:05:35.7315796Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:35.7316218Z mm 0.0107 ms 100.0% 2025-09-07T10:05:35.7317121Z triton_mm_33 0.0131 ms 81.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:35.7318635Z triton_mm_34 0.0162 ms 66.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:05:35.7319956Z triton_mm_40 0.0195 ms 54.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:35.7321233Z triton_mm_30 0.0200 ms 53.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:05:35.7322719Z triton_mm_27 0.0206 ms 51.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:05:35.7324134Z triton_mm_32 0.0219 ms 48.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:05:35.7325945Z triton_mm_29 0.0221 ms 48.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:35.7327416Z triton_mm_41 0.0248 ms 43.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:05:35.7328905Z triton_mm_28 0.0251 ms 42.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:35.7330967Z SingleProcess AUTOTUNE benchmarking takes 0.4897 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:05:36.2502068Z Autotune Choices Stats: 2025-09-07T10:05:36.2503153Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_82", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.02396799996495247, "best_triton_pos": 0} 2025-09-07T10:05:36.2920603Z AUTOTUNE mm(1576x3072, 3072x768) 2025-09-07T10:05:36.2920851Z strides: [3072, 1], [1, 3072] 2025-09-07T10:05:36.2921079Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:36.2921644Z triton_mm_82 0.0240 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:36.2922568Z triton_mm_76 0.0294 ms 81.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=4 2025-09-07T10:05:36.2923424Z triton_mm_75 0.0297 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:36.2924263Z triton_mm_81 0.0312 ms 76.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:36.2925308Z triton_mm_71 0.0330 ms 72.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:36.2926147Z triton_mm_72 0.0335 ms 71.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:05:36.2926978Z triton_mm_74 0.0354 ms 67.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:36.2927810Z triton_mm_78 0.0356 ms 67.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:36.2928635Z triton_mm_73 0.0451 ms 53.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:36.2929455Z triton_mm_77 0.0454 ms 52.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:36.2930177Z SingleProcess AUTOTUNE benchmarking takes 0.5582 seconds and 0.0007 seconds precompiling for 20 choices 2025-09-07T10:05:49.5857359Z Autotune Choices Stats: 2025-09-07T10:05:49.5858721Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01833599992096424, 
"best_triton_pos": 1, "best_triton_time": 0.02006400004029274, "best_triton_kernel": "triton_mm_985", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:05:49.6108767Z AUTOTUNE mm(1576x768, 768x3072) 2025-09-07T10:05:49.6109251Z strides: [768, 1], [3072, 1] 2025-09-07T10:05:49.6109675Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:49.6109953Z mm 0.0183 ms 100.0% 2025-09-07T10:05:49.6110552Z triton_mm_985 0.0201 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:49.6111532Z triton_mm_978 0.0211 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:49.6112877Z triton_mm_986 0.0222 ms 82.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:49.6113853Z triton_mm_980 0.0231 ms 79.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:49.6115593Z triton_mm_979 0.0237 ms 77.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:49.6116647Z triton_mm_987 0.0249 ms 73.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:49.6117628Z triton_mm_982 0.0275 ms 66.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:05:49.6118610Z triton_mm_983 0.0276 ms 66.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:49.6119572Z triton_mm_981 0.0284 ms 64.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:49.6120301Z SingleProcess AUTOTUNE benchmarking takes 0.2794 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:05:50.8237411Z Autotune Choices Stats: 2025-09-07T10:05:50.8238709Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01711999997496605, "best_triton_pos": 1, "best_triton_time": 0.024032000452280045, "best_triton_kernel": "triton_mm_999", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:05:50.8661960Z AUTOTUNE mm(768x1576, 1576x3072) 2025-09-07T10:05:50.8662255Z strides: [1, 768], [3072, 1] 2025-09-07T10:05:50.8662518Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:50.8662780Z mm 0.0171 ms 100.0% 2025-09-07T10:05:50.8663391Z triton_mm_999 0.0240 ms 71.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:50.8664446Z triton_mm_1005 0.0246 ms 69.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:50.8665934Z triton_mm_997 0.0276 ms 62.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:50.8666924Z triton_mm_1004 0.0284 ms 60.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:50.8667904Z triton_mm_998 0.0285 ms 60.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:50.8668866Z triton_mm_1001 0.0294 ms 58.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:50.8670038Z triton_mm_995 0.0295 ms 58.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:50.8671019Z triton_mm_1006 0.0297 ms 57.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:50.8672329Z triton_mm_1000 0.0301 ms 56.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:50.8673308Z SingleProcess AUTOTUNE benchmarking takes 0.7032 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:05:51.7780995Z Autotune Choices Stats: 2025-09-07T10:05:51.7782946Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01708799973130226, "best_triton_pos": 1, "best_triton_time": 0.02364799939095974, "best_triton_kernel": "triton_mm_1037", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:05:52.0431662Z AUTOTUNE mm(3072x1576, 1576x768) 2025-09-07T10:05:52.0431945Z strides: [1, 3072], [768, 1] 2025-09-07T10:05:52.0432203Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:52.0432484Z mm 0.0171 ms 100.0% 
2025-09-07T10:05:52.0433086Z triton_mm_1037 0.0236 ms 72.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.0434103Z triton_mm_1043 0.0247 ms 69.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.0435462Z triton_mm_1035 0.0269 ms 63.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.0436486Z triton_mm_1042 0.0275 ms 62.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.0437498Z triton_mm_1036 0.0282 ms 60.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:52.0438481Z triton_mm_1039 0.0282 ms 60.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.0439459Z triton_mm_1040 0.0292 ms 58.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:52.0440434Z triton_mm_1038 0.0295 ms 58.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:52.0441359Z triton_mm_1033 0.0296 ms 57.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:52.0442150Z SingleProcess AUTOTUNE benchmarking takes 0.9860 seconds and 0.0002 
seconds precompiling for 20 choices 2025-09-07T10:05:52.8015353Z Autotune Choices Stats: 2025-09-07T10:05:52.8016691Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.015936000272631645, "best_triton_pos": 1, "best_triton_time": 0.017952000722289085, "best_triton_kernel": "triton_mm_1120", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T10:05:52.8617469Z AUTOTUNE mm(2304x1576, 1576x768) 2025-09-07T10:05:52.8617779Z strides: [1, 2304], [768, 1] 2025-09-07T10:05:52.8618072Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:52.8618361Z mm 0.0159 ms 100.0% 2025-09-07T10:05:52.8618994Z triton_mm_1120 0.0180 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:52.8620027Z triton_mm_1113 0.0195 ms 81.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.8621512Z triton_mm_1119 0.0203 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.8622418Z triton_mm_1112 0.0214 ms 74.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:52.8623560Z triton_mm_1114 0.0220 ms 72.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:52.8624504Z triton_mm_1116 0.0222 ms 71.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=8 2025-09-07T10:05:52.8625623Z triton_mm_1109 0.0243 ms 65.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:52.8626533Z triton_mm_1111 0.0253 ms 62.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.8627431Z triton_mm_1118 0.0260 ms 61.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:52.8628224Z SingleProcess AUTOTUNE benchmarking takes 0.4950 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:05:53.7981323Z Autotune Choices Stats: 2025-09-07T10:05:53.7982478Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.011264000087976456, "best_triton_pos": 1, "best_triton_time": 0.013055999763309956, "best_triton_kernel": "triton_mm_1076", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:05:53.8852334Z AUTOTUNE mm(768x1576, 1576x768) 2025-09-07T10:05:53.8852599Z strides: [1, 768], [768, 1] 2025-09-07T10:05:53.8852833Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:53.8853076Z mm 0.0113 ms 100.0% 2025-09-07T10:05:53.8853636Z triton_mm_1076 0.0131 ms 86.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:53.8854587Z triton_mm_1072 0.0161 ms 70.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:05:53.8855848Z triton_mm_1082 0.0161 ms 70.0% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:53.8856800Z triton_mm_1075 0.0166 ms 68.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:53.8857706Z triton_mm_1071 0.0169 ms 66.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:53.8858629Z triton_mm_1074 0.0182 ms 62.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:53.8859531Z triton_mm_1081 0.0184 ms 61.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:53.8860431Z triton_mm_1078 0.0186 ms 60.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:53.8861840Z triton_mm_1068 0.0222 ms 50.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:05:53.8862623Z SingleProcess AUTOTUNE benchmarking takes 0.3087 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:05:55.2096770Z Autotune Choices Stats: 2025-09-07T10:05:55.2098593Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.018112000077962875, "best_triton_pos": 1, "best_triton_time": 0.024351999163627625, "best_triton_kernel": "triton_mm_1025", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 
2025-09-07T10:05:55.3404153Z AUTOTUNE mm(1576x3072, 3072x768) 2025-09-07T10:05:55.3404426Z strides: [3072, 1], [768, 1] 2025-09-07T10:05:55.3404716Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:55.3405170Z mm 0.0181 ms 100.0% 2025-09-07T10:05:55.3405789Z triton_mm_1025 0.0244 ms 74.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:55.3406770Z triton_mm_1018 0.0284 ms 63.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:55.3407755Z triton_mm_1019 0.0291 ms 62.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:55.3408728Z triton_mm_1014 0.0298 ms 60.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:55.3409690Z triton_mm_1024 0.0305 ms 59.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:55.3410739Z triton_mm_1015 0.0317 ms 57.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:05:55.3411664Z triton_mm_1017 0.0337 ms 53.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:55.3412583Z triton_mm_1021 0.0343 ms 52.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:55.3413476Z triton_mm_1016 0.0422 ms 42.9% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:55.3414259Z SingleProcess AUTOTUNE benchmarking takes 0.4039 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:05:55.7233340Z Autotune Choices Stats: 2025-09-07T10:05:55.7234571Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010400000028312206, "best_triton_pos": 1, "best_triton_time": 0.012000000104308128, "best_triton_kernel": "triton_mm_1063", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T10:05:55.7535458Z AUTOTUNE mm(1576x768, 768x768) 2025-09-07T10:05:55.7535916Z strides: [768, 1], [768, 1] 2025-09-07T10:05:55.7536327Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:55.7536751Z mm 0.0104 ms 100.0% 2025-09-07T10:05:55.7537745Z triton_mm_1063 0.0120 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:55.7539636Z triton_mm_1056 0.0124 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:55.7541427Z triton_mm_1052 0.0125 ms 82.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:55.7542267Z triton_mm_1062 0.0131 ms 79.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:55.7543284Z triton_mm_1055 0.0132 ms 78.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=8 2025-09-07T10:05:55.7544125Z triton_mm_1059 0.0137 ms 75.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:55.7545116Z triton_mm_1057 0.0140 ms 74.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:55.7545959Z triton_mm_1054 0.0151 ms 68.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:55.7546787Z triton_mm_1053 0.0156 ms 66.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:05:55.7547511Z SingleProcess AUTOTUNE benchmarking takes 0.4121 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:05:56.3649292Z Autotune Choices Stats: 2025-09-07T10:05:56.3651279Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.015359999611973763, "best_triton_pos": 1, "best_triton_time": 0.020479999482631683, "best_triton_kernel": "triton_mm_1101", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T10:05:56.4085738Z AUTOTUNE mm(1576x2304, 2304x768) 2025-09-07T10:05:56.4086014Z strides: [2304, 1], [768, 1] 2025-09-07T10:05:56.4086270Z dtypes: torch.float16, torch.float16 2025-09-07T10:05:56.4086540Z mm 0.0154 ms 100.0% 2025-09-07T10:05:56.4087158Z triton_mm_1101 0.0205 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:05:56.4088148Z triton_mm_1094 0.0229 ms 66.9% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:56.4089120Z triton_mm_1095 0.0240 ms 64.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:05:56.4090099Z triton_mm_1100 0.0242 ms 63.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:56.4091156Z triton_mm_1090 0.0242 ms 63.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:05:56.4092054Z triton_mm_1091 0.0265 ms 58.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:05:56.4092959Z triton_mm_1093 0.0268 ms 57.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:56.4093870Z triton_mm_1097 0.0271 ms 56.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:05:56.4095288Z triton_mm_1092 0.0332 ms 46.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:05:56.4096072Z SingleProcess AUTOTUNE benchmarking takes 0.6540 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:06:00.8330952Z W0907 10:06:00.831000 152816 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:06:21.3047869Z pass 2025-09-07T10:06:26.6353315Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:06:26.6354539Z import pynvml # type: ignore[import] 2025-09-07T10:06:29.6361391Z 2025-09-07T10:06:34.1725496Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:06:34.1725867Z loading model: 0it [00:04, ?it/s] 2025-09-07T10:06:34.1726175Z cuda train volo_d1_224 2025-09-07T10:07:05.0452731Z Autotune Choices Stats: 2025-09-07T10:07:05.0453821Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_106", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.012415999546647072, "best_triton_pos": 0} 2025-09-07T10:07:05.0649226Z AUTOTUNE addmm(6272x576, 6272x192, 192x576) 2025-09-07T10:07:05.0649554Z strides: [0, 1], [192, 1], [1, 192] 2025-09-07T10:07:05.0649858Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:07:05.0650574Z triton_mm_106 0.0124 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.0651558Z triton_mm_105 0.0127 ms 97.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.0652472Z triton_mm_98 0.0129 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.0653380Z triton_mm_95 0.0130 ms 95.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, 
num_warps=4 2025-09-07T10:07:05.0654275Z triton_mm_102 0.0131 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.0655422Z triton_mm_99 0.0136 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:05.0656321Z triton_mm_103 0.0138 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:05.0657216Z triton_mm_104 0.0140 ms 89.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:07:05.0658103Z triton_mm_100 0.0146 ms 85.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.0658675Z bias_addmm 0.0146 ms 84.9% 2025-09-07T10:07:05.0659111Z SingleProcess AUTOTUNE benchmarking takes 0.3060 seconds and 0.0004 seconds precompiling for 21 choices 2025-09-07T10:07:05.7090378Z Autotune Choices Stats: 2025-09-07T10:07:05.7092152Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.01065600011497736, "best_triton_pos": 1, "best_triton_time": 0.011392000131309032, "best_triton_kernel": "triton_mm_489", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:07:05.7244499Z AUTOTUNE addmm(1568x1152, 1568x384, 384x1152) 2025-09-07T10:07:05.7244877Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:07:05.7245680Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:07:05.7246534Z bias_addmm 0.0107 ms 100.0% 
2025-09-07T10:07:05.7247237Z triton_mm_489 0.0114 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.7248299Z triton_mm_483 0.0114 ms 93.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.7249351Z triton_mm_490 0.0116 ms 92.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:05.7250446Z triton_mm_486 0.0118 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:05.7251476Z triton_mm_482 0.0119 ms 89.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:05.7252413Z triton_mm_479 0.0121 ms 88.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:05.7253397Z triton_mm_481 0.0127 ms 84.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.7254365Z triton_mm_485 0.0127 ms 83.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.7255560Z triton_mm_488 0.0132 ms 80.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:05.7256425Z SingleProcess AUTOTUNE benchmarking takes 0.3090 seconds and 0.0002 seconds 
precompiling for 21 choices 2025-09-07T10:07:06.4344100Z Autotune Choices Stats: 2025-09-07T10:07:06.4346197Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.04255999997258186, "best_triton_pos": 1, "best_triton_time": 0.05711999908089638, "best_triton_kernel": "triton_convolution2d_26", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:07:06.6790140Z AUTOTUNE convolution(8x64x112x112, 192x64x4x4) 2025-09-07T10:07:06.6790768Z strides: [802816, 12544, 112, 1], [1024, 16, 4, 1] 2025-09-07T10:07:06.6791330Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:06.6791824Z convolution 0.0426 ms 100.0% 2025-09-07T10:07:06.6793197Z triton_convolution2d_26 0.0571 ms 74.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:07:06.6796142Z triton_convolution2d_23 0.0598 ms 71.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:07:06.6798428Z triton_convolution2d_21 0.0670 ms 63.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:07:06.6801289Z triton_convolution2d_24 0.0697 ms 61.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:07:06.6803421Z triton_convolution2d_25 0.0716 ms 59.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, 
STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:07:06.6805556Z triton_convolution2d_20 0.0844 ms 50.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:07:06.6807390Z triton_convolution2d_22 0.2631 ms 16.2% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=4, KERNEL_W=4, PADDING_H=0, PADDING_W=0, STRIDE_H=4, STRIDE_W=4, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:07:06.6808816Z SingleProcess AUTOTUNE benchmarking takes 0.3966 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:07:07.0348880Z Autotune Choices Stats: 2025-09-07T10:07:07.0350264Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01190400030463934, "best_triton_pos": 1, "best_triton_time": 0.012415999546647072, "best_triton_kernel": "triton_mm_426", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T10:07:07.0913168Z AUTOTUNE mm(6272x576, 576x192) 2025-09-07T10:07:07.0913470Z strides: [576, 1], [1, 576] 2025-09-07T10:07:07.0913744Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:07.0914024Z mm 0.0119 ms 100.0% 2025-09-07T10:07:07.0914672Z triton_mm_426 0.0124 ms 95.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:07.0916347Z triton_mm_419 0.0127 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:07.0917389Z triton_mm_415 0.0128 ms 92.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=8 2025-09-07T10:07:07.0918440Z triton_mm_425 0.0135 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:07.0919488Z triton_mm_418 0.0139 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:07.0920537Z triton_mm_422 0.0140 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:07.0921585Z triton_mm_421 0.0153 ms 78.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:07.0922613Z triton_mm_417 0.0156 ms 76.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:07.0923658Z triton_mm_424 0.0160 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:07.0924581Z SingleProcess AUTOTUNE benchmarking takes 0.2989 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:07:08.2944153Z Autotune Choices Stats: 2025-09-07T10:07:08.2946553Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_1579", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.0077760000713169575, "best_triton_pos": 0} 2025-09-07T10:07:08.3239825Z AUTOTUNE addmm(8x1152, 8x384, 384x1152) 2025-09-07T10:07:08.3240273Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:07:08.3240714Z dtypes: torch.float16, 
torch.float16, torch.float16 2025-09-07T10:07:08.3242188Z triton_mm_1579 0.0078 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:08.3243699Z triton_mm_1583 0.0079 ms 98.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:08.3244639Z bias_addmm 0.0083 ms 93.5% 2025-09-07T10:07:08.3245725Z triton_mm_1578 0.0085 ms 91.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:08.3247172Z triton_mm_1582 0.0086 ms 90.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:08.3248615Z triton_mm_1591 0.0086 ms 90.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:08.3250065Z triton_mm_1577 0.0088 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:08.3251502Z triton_mm_1589 0.0089 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:08.3252954Z triton_mm_1587 0.0089 ms 87.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:08.3254424Z triton_mm_1576 0.0091 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 
2025-09-07T10:07:08.3255865Z SingleProcess AUTOTUNE benchmarking takes 0.2777 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:07:08.7734491Z Autotune Choices Stats: 2025-09-07T10:07:08.7736567Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_1521", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.0072639998979866505, "best_triton_pos": 0} 2025-09-07T10:07:08.7901202Z AUTOTUNE mm(8x384, 384x384) 2025-09-07T10:07:08.7901739Z strides: [384, 1], [1, 384] 2025-09-07T10:07:08.7902129Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:08.7903094Z triton_mm_1521 0.0073 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:08.7904595Z triton_mm_1525 0.0073 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:08.7905747Z mm 0.0074 ms 97.8% 2025-09-07T10:07:08.7906660Z triton_mm_1529 0.0078 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:08.7908148Z triton_mm_1519 0.0078 ms 93.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:08.7910104Z triton_mm_1520 0.0078 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:08.7911808Z triton_mm_1518 0.0080 ms 90.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=2, num_warps=2 2025-09-07T10:07:08.7913640Z triton_mm_1524 0.0080 ms 90.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:08.7915300Z triton_mm_1533 0.0081 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:08.7916760Z triton_mm_1531 0.0084 ms 86.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:08.7918034Z SingleProcess AUTOTUNE benchmarking takes 0.2265 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:07:10.9219392Z Autotune Choices Stats: 2025-09-07T10:07:10.9220578Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "triton_convolution2d_12", "best_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8", "best_time": 0.058720000088214874, "best_triton_pos": 0} 2025-09-07T10:07:10.9683075Z AUTOTUNE convolution(8x64x112x112, 64x64x3x3) 2025-09-07T10:07:10.9683421Z strides: [802816, 12544, 112, 1], [576, 9, 3, 1] 2025-09-07T10:07:10.9683721Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:10.9684498Z triton_convolution2d_12 0.0587 ms 100.0% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:07:10.9686610Z triton_convolution2d_9 0.0605 ms 97.1% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:07:10.9687363Z convolution 0.0614 ms 95.6% 
2025-09-07T10:07:10.9688100Z triton_convolution2d_10 0.0712 ms 82.5% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:07:10.9689303Z triton_convolution2d_11 0.0760 ms 77.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:07:10.9690500Z triton_convolution2d_7 0.0777 ms 75.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:07:10.9691731Z triton_convolution2d_6 0.0855 ms 68.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:07:10.9692934Z triton_convolution2d_8 0.1914 ms 30.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=1, STRIDE_W=1, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:07:10.9693830Z SingleProcess AUTOTUNE benchmarking takes 0.1907 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:07:11.2049582Z Autotune Choices Stats: 2025-09-07T10:07:11.2050801Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.009279999881982803, "best_triton_pos": 1, "best_triton_time": 0.009344000369310379, "best_triton_kernel": "triton_mm_34", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8"} 2025-09-07T10:07:11.2233575Z AUTOTUNE mm(6272x192, 192x192) 2025-09-07T10:07:11.2233819Z strides: [192, 1], [1, 192] 2025-09-07T10:07:11.2234066Z dtypes: 
torch.float16, torch.float16 2025-09-07T10:07:11.2234329Z mm 0.0093 ms 100.0% 2025-09-07T10:07:11.2234910Z triton_mm_34 0.0093 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:11.2236299Z triton_mm_38 0.0094 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:11.2237267Z triton_mm_37 0.0095 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:11.2238238Z triton_mm_41 0.0096 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:11.2239194Z triton_mm_44 0.0098 ms 95.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:11.2240163Z triton_mm_36 0.0099 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:11.2241120Z triton_mm_45 0.0099 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:11.2242079Z triton_mm_40 0.0100 ms 92.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:11.2243014Z triton_mm_43 0.0102 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:11.2243798Z SingleProcess 
AUTOTUNE benchmarking takes 0.2496 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:11.7387721Z Autotune Choices Stats: 2025-09-07T10:07:11.7388761Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_53", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.008895999751985073, "best_triton_pos": 0} 2025-09-07T10:07:11.7905464Z AUTOTUNE addmm(1568x486, 1568x192, 192x486) 2025-09-07T10:07:11.7905764Z strides: [0, 1], [192, 1], [1, 192] 2025-09-07T10:07:11.7906097Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:07:11.7906792Z triton_mm_53 0.0089 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:11.7907787Z triton_mm_57 0.0093 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:11.7908750Z triton_mm_60 0.0093 ms 95.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:11.7909717Z triton_mm_56 0.0096 ms 92.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:11.7910698Z triton_mm_58 0.0098 ms 90.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:11.7911989Z triton_mm_55 0.0100 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:11.7913201Z triton_mm_59 0.0101 ms 
88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:11.7914160Z triton_mm_48 0.0102 ms 87.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:11.7915737Z triton_mm_64 0.0102 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:11.7916735Z triton_mm_47 0.0104 ms 85.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:11.7917579Z SingleProcess AUTOTUNE benchmarking takes 0.5659 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:07:11.8684772Z Autotune Choices Stats: 2025-09-07T10:07:11.8685908Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_bmm_65", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.012032000347971916, "best_triton_pos": 0} 2025-09-07T10:07:11.9621659Z AUTOTUNE bmm(9408x9x9, 9408x9x32) 2025-09-07T10:07:11.9621977Z strides: [81, 9, 1], [288, 32, 1] 2025-09-07T10:07:11.9622263Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:11.9622954Z triton_bmm_65 0.0120 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:07:11.9623978Z triton_bmm_67 0.0121 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:11.9625275Z triton_bmm_68 0.0121 ms 99.2% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T10:07:11.9626252Z triton_bmm_66 0.0122 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:07:11.9627213Z triton_bmm_69 0.0122 ms 98.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T10:07:11.9627854Z bmm 0.0392 ms 30.7% 2025-09-07T10:07:11.9628321Z SingleProcess AUTOTUNE benchmarking takes 0.1711 seconds and 0.0002 seconds precompiling for 6 choices 2025-09-07T10:07:12.2641154Z Autotune Choices Stats: 2025-09-07T10:07:12.2642480Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010048000141978264, "best_triton_pos": 1, "best_triton_time": 0.010528000071644783, "best_triton_kernel": "triton_mm_451", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:07:12.2985363Z AUTOTUNE mm(1568x384, 384x1152) 2025-09-07T10:07:12.2985625Z strides: [384, 1], [1, 384] 2025-09-07T10:07:12.2985883Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:12.2986153Z mm 0.0100 ms 100.0% 2025-09-07T10:07:12.2986780Z triton_mm_451 0.0105 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.2987755Z triton_mm_452 0.0105 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:12.2988958Z triton_mm_445 0.0107 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.2990035Z triton_mm_444 0.0112 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:12.2991199Z triton_mm_448 0.0113 ms 89.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:12.2992229Z triton_mm_441 0.0114 ms 88.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:12.2993309Z triton_mm_447 0.0118 ms 85.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.2994277Z triton_mm_443 0.0120 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.2995403Z triton_mm_450 0.0123 ms 82.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.2996250Z SingleProcess AUTOTUNE benchmarking takes 0.2641 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:12.5375640Z Autotune Choices Stats: 2025-09-07T10:07:12.5376922Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.01033599954098463, "best_triton_pos": 1, "best_triton_time": 0.01196799986064434, "best_triton_kernel": "triton_mm_503", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:07:12.5679276Z AUTOTUNE mm(1568x1152, 1152x384) 
2025-09-07T10:07:12.5679563Z strides: [1152, 1], [1, 1152] 2025-09-07T10:07:12.5679818Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:12.5680087Z mm 0.0103 ms 100.0% 2025-09-07T10:07:12.5680707Z triton_mm_503 0.0120 ms 86.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:12.5681714Z triton_mm_509 0.0136 ms 76.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:12.5682821Z triton_mm_498 0.0141 ms 73.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:12.5683901Z triton_mm_502 0.0143 ms 72.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.5684922Z triton_mm_499 0.0144 ms 71.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:12.5686207Z triton_mm_508 0.0160 ms 64.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.5687201Z triton_mm_505 0.0161 ms 64.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:12.5688164Z triton_mm_501 0.0161 ms 64.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:12.5689126Z triton_mm_495 0.0162 ms 64.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:12.5690433Z SingleProcess AUTOTUNE benchmarking takes 0.2631 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:12.9328542Z Autotune Choices Stats: 2025-09-07T10:07:12.9329861Z {"num_choices": 14, "num_triton_choices": 13, "best_kernel": "triton_bmm_1534", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.007648000027984381, "best_triton_pos": 0} 2025-09-07T10:07:12.9520292Z AUTOTUNE bmm(96x1x32, 96x32x197) 2025-09-07T10:07:12.9520560Z strides: [32, 0, 1], [6336, 197, 1] 2025-09-07T10:07:12.9520831Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:12.9521487Z triton_bmm_1534 0.0076 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:07:12.9522543Z triton_bmm_1539 0.0078 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:12.9523752Z triton_bmm_1536 0.0078 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:12.9524727Z triton_bmm_1540 0.0079 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.9526071Z triton_bmm_1544 0.0079 ms 96.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:12.9527044Z triton_bmm_1543 0.0083 ms 91.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:12.9528019Z triton_bmm_1541 0.0084 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:12.9528984Z triton_bmm_1542 0.0084 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:12.9529951Z triton_bmm_1545 0.0084 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:07:12.9530920Z triton_bmm_1546 0.0085 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:12.9531770Z SingleProcess AUTOTUNE benchmarking takes 0.1871 seconds and 0.0002 seconds precompiling for 14 choices 2025-09-07T10:07:13.0939854Z Autotune Choices Stats: 2025-09-07T10:07:13.0940875Z {"num_choices": 12, "num_triton_choices": 11, "best_kernel": "triton_bmm_1550", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.00800000037997961, "best_triton_pos": 0} 2025-09-07T10:07:13.1555194Z AUTOTUNE bmm(96x1x197, 96x197x32) 2025-09-07T10:07:13.1555474Z strides: [197, 18944, 1], [6336, 32, 1] 2025-09-07T10:07:13.1555754Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:13.1556425Z triton_bmm_1550 0.0080 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:13.1557428Z triton_bmm_1556 0.0082 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T10:07:13.1558274Z bmm 0.0082 ms 97.3% 2025-09-07T10:07:13.1558950Z triton_bmm_1557 0.0082 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:13.1559936Z triton_bmm_1548 0.0083 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:07:13.1561090Z triton_bmm_1553 0.0087 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T10:07:13.1562073Z triton_bmm_1549 0.0100 ms 80.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:13.1563057Z triton_bmm_1555 0.0101 ms 79.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T10:07:13.1564007Z triton_bmm_1554 0.0101 ms 79.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T10:07:13.1564912Z triton_bmm_1552 0.0103 ms 77.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:07:13.1565875Z SingleProcess AUTOTUNE benchmarking takes 0.2028 seconds and 0.0002 seconds precompiling for 12 choices 2025-09-07T10:07:13.3947807Z Autotune Choices Stats: 2025-09-07T10:07:13.3948731Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_1566", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.0074880002066493034, "best_triton_pos": 0} 2025-09-07T10:07:13.4397688Z AUTOTUNE addmm(8x384, 8x384, 384x384) 2025-09-07T10:07:13.4397980Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:07:13.4398274Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:07:13.4398970Z triton_mm_1566 0.0075 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:13.4399985Z triton_mm_1562 0.0076 ms 97.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:13.4400606Z bias_addmm 0.0082 ms 91.8% 2025-09-07T10:07:13.4401195Z triton_mm_1561 0.0083 ms 90.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:13.4402148Z triton_mm_1565 0.0083 ms 90.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:13.4403126Z triton_mm_1560 0.0084 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:13.4404071Z triton_mm_1574 0.0084 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:13.4405394Z triton_mm_1570 0.0084 ms 89.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:13.4406331Z triton_mm_1559 0.0087 ms 86.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:07:13.4407606Z triton_mm_1572 0.0087 ms 86.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:13.4408521Z SingleProcess AUTOTUNE benchmarking takes 0.2835 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:07:13.6511323Z Autotune Choices Stats: 2025-09-07T10:07:13.6512772Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_1596", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.00848000030964613, "best_triton_pos": 0} 2025-09-07T10:07:13.7716768Z AUTOTUNE mm(8x1152, 1152x384) 2025-09-07T10:07:13.7717066Z strides: [1152, 1], [1, 1152] 2025-09-07T10:07:13.7717324Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:13.7717982Z triton_mm_1596 0.0085 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:13.7718641Z mm 0.0086 ms 98.1% 2025-09-07T10:07:13.7719206Z triton_mm_1600 0.0090 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:13.7720169Z triton_mm_1604 0.0104 ms 81.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:13.7721137Z triton_mm_1608 0.0113 ms 75.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:13.7722082Z triton_mm_1595 0.0116 ms 73.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, 
BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:13.7723047Z triton_mm_1594 0.0122 ms 69.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:13.7723976Z triton_mm_1599 0.0124 ms 68.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:13.7724855Z triton_mm_1593 0.0130 ms 65.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:07:13.7725906Z triton_mm_1603 0.0132 ms 64.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:13.7726726Z SingleProcess AUTOTUNE benchmarking takes 0.3313 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:07:14.0033104Z Autotune Choices Stats: 2025-09-07T10:07:14.0034171Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_1724", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.007296000141650438, "best_triton_pos": 0} 2025-09-07T10:07:14.0176860Z AUTOTUNE mm(8x384, 384x1000) 2025-09-07T10:07:14.0177133Z strides: [384, 1], [1, 384] 2025-09-07T10:07:14.0177400Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:14.0178089Z triton_mm_1724 0.0073 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:14.0179088Z triton_mm_1728 0.0075 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:14.0179705Z mm 0.0078 ms 93.8% 2025-09-07T10:07:14.0180293Z triton_mm_1723 0.0081 ms 90.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:07:14.0182009Z triton_mm_1736 0.0082 ms 88.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:14.0183028Z triton_mm_1722 0.0083 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:14.0184203Z triton_mm_1732 0.0083 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:14.0185245Z triton_mm_1727 0.0084 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:14.0186111Z triton_mm_1721 0.0085 ms 85.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:07:14.0186961Z triton_mm_1734 0.0087 ms 83.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:14.0187709Z SingleProcess AUTOTUNE benchmarking takes 0.2235 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:07:14.2794107Z Autotune Choices Stats: 2025-09-07T10:07:14.2795422Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_1748", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.011264000087976456, "best_triton_pos": 0} 2025-09-07T10:07:14.4907858Z AUTOTUNE addmm(1568x1000, 1568x384, 384x1000) 2025-09-07T10:07:14.4908202Z strides: [0, 1], [384, 1], [1, 384] 2025-09-07T10:07:14.4908522Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:07:14.4909230Z triton_mm_1748 0.0113 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:14.4909866Z bias_addmm 0.0114 ms 99.2% 2025-09-07T10:07:14.4910506Z triton_mm_1754 0.0114 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:14.4919181Z triton_mm_1755 0.0114 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:14.4920169Z triton_mm_1747 0.0116 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:14.4921156Z triton_mm_1744 0.0118 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:14.4922128Z triton_mm_1751 0.0118 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:14.4923100Z triton_mm_1750 0.0126 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:14.4924057Z triton_mm_1746 0.0126 ms 89.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:14.4924838Z triton_mm_1753 0.0130 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:14.4925933Z SingleProcess AUTOTUNE benchmarking takes 0.4724 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:07:20.8563699Z skipping cudagraphs due to disabling cudagraphs due to incompatible op aten.index_put.default Found from File "/var/lib/jenkins/workspace/benchmarks/dynamo/timm_models.py", line 442, in torch_dynamo_resume_in_forward_and_backward_pass_at_440 2025-09-07T10:07:20.8564788Z pred = mod(*cloned_inputs) 2025-09-07T10:07:20.8566013Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 822, in forward 2025-09-07T10:07:20.8566490Z x = self.forward_features(x) 2025-09-07T10:07:20.8566954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 795, in forward_features 2025-09-07T10:07:20.8567431Z x = self.forward_tokens(x) 2025-09-07T10:07:20.8567869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 642, in forward_tokens 2025-09-07T10:07:20.8568328Z x = block(x) 2025-09-07T10:07:20.8568691Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 135, in forward 2025-09-07T10:07:20.8569153Z x = x + self.drop_path1(self.attn(self.norm1(x))) 2025-09-07T10:07:20.8569604Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/timm/models/volo.py", line 90, in forward 2025-09-07T10:07:20.8570225Z x = F.fold(x, output_size=(H, W), kernel_size=self.kernel_size, padding=self.padding, stride=self.stride) 2025-09-07T10:07:20.8570565Z 2025-09-07T10:07:20.8570568Z 2025-09-07T10:07:47.6093430Z Autotune Choices Stats: 2025-09-07T10:07:47.6094533Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": 
"triton_mm_4408", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.011168000288307667, "best_triton_pos": 0} 2025-09-07T10:07:47.6463734Z AUTOTUNE mm(6272x192, 192x576) 2025-09-07T10:07:47.6464065Z strides: [192, 1], [576, 1] 2025-09-07T10:07:47.6464359Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:47.6465278Z triton_mm_4408 0.0112 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:47.6466328Z triton_mm_4416 0.0112 ms 99.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:47.6467326Z triton_mm_4415 0.0113 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:47.6467949Z mm 0.0118 ms 94.8% 2025-09-07T10:07:47.6468535Z triton_mm_4412 0.0118 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:47.6469530Z triton_mm_4405 0.0123 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:47.6470500Z triton_mm_4409 0.0123 ms 91.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:47.6471477Z triton_mm_4413 0.0126 ms 88.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:47.6472596Z triton_mm_4414 
0.0128 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:07:47.6473605Z triton_mm_4417 0.0133 ms 83.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:47.6475309Z SingleProcess AUTOTUNE benchmarking takes 0.8860 seconds and 0.0003 seconds precompiling for 20 choices 2025-09-07T10:07:49.2755874Z Autotune Choices Stats: 2025-09-07T10:07:49.2757864Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010048000141978264, "best_triton_pos": 1, "best_triton_time": 0.010432000271975994, "best_triton_kernel": "triton_mm_2282", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:07:49.4299411Z AUTOTUNE mm(1568x384, 384x1152) 2025-09-07T10:07:49.4299694Z strides: [384, 1], [1152, 1] 2025-09-07T10:07:49.4299949Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:49.4300193Z mm 0.0100 ms 100.0% 2025-09-07T10:07:49.4300754Z triton_mm_2282 0.0104 ms 96.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:49.4301799Z triton_mm_2289 0.0105 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:49.4302706Z triton_mm_2288 0.0108 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:49.4303623Z triton_mm_2281 0.0108 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:49.4304512Z triton_mm_2285 0.0110 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:49.4305620Z triton_mm_2278 0.0114 ms 88.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:49.4306532Z triton_mm_2280 0.0115 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:49.4307419Z triton_mm_2284 0.0119 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:49.4308332Z triton_mm_2287 0.0121 ms 82.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:49.4309126Z SingleProcess AUTOTUNE benchmarking takes 1.3104 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:50.8040947Z Autotune Choices Stats: 2025-09-07T10:07:50.8042006Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_1846", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.006207999773323536, "best_triton_pos": 0} 2025-09-07T10:07:50.8241915Z AUTOTUNE mm(384x8, 8x1152) 2025-09-07T10:07:50.8242143Z strides: [1, 384], [1152, 1] 2025-09-07T10:07:50.8242387Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:50.8243036Z triton_mm_1846 0.0062 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=8 2025-09-07T10:07:50.8244267Z triton_mm_1849 0.0064 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:50.8245623Z triton_mm_1844 0.0066 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:07:50.8247226Z triton_mm_1850 0.0066 ms 94.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:50.8248208Z triton_mm_1847 0.0066 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:50.8249469Z triton_mm_1853 0.0066 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:50.8250451Z triton_mm_1854 0.0066 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:50.8251444Z triton_mm_1855 0.0066 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:50.8252419Z triton_mm_1845 0.0066 ms 93.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:50.8253511Z triton_mm_1848 0.0068 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:50.8254308Z SingleProcess AUTOTUNE benchmarking takes 0.3774 
seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:07:52.8852794Z Autotune Choices Stats: 2025-09-07T10:07:52.8853999Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_1880", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.006335999816656113, "best_triton_pos": 0} 2025-09-07T10:07:52.8995546Z AUTOTUNE mm(1152x8, 8x384) 2025-09-07T10:07:52.8995826Z strides: [1, 1152], [384, 1] 2025-09-07T10:07:52.8996081Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:52.8996738Z triton_mm_1880 0.0063 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:52.8997737Z triton_mm_1883 0.0064 ms 99.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:52.8998703Z triton_mm_1882 0.0064 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:52.8999685Z triton_mm_1881 0.0066 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:52.9000657Z triton_mm_1885 0.0066 ms 96.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:52.9001626Z triton_mm_1879 0.0067 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:52.9002600Z triton_mm_1889 0.0067 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:52.9003575Z triton_mm_1878 0.0068 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:52.9004538Z triton_mm_1884 0.0068 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:53.0028156Z triton_mm_1888 0.0068 ms 93.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:53.0029025Z SingleProcess AUTOTUNE benchmarking takes 0.6076 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:07:53.2025351Z Autotune Choices Stats: 2025-09-07T10:07:53.2027433Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.011711999773979187, "best_triton_pos": 1, "best_triton_time": 0.011711999773979187, "best_triton_kernel": "triton_mm_2393", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:07:53.2579081Z AUTOTUNE mm(1152x1568, 1568x384) 2025-09-07T10:07:53.2579402Z strides: [1, 1152], [384, 1] 2025-09-07T10:07:53.2579696Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:53.2579988Z mm 0.0117 ms 100.0% 2025-09-07T10:07:53.2580660Z triton_mm_2393 0.0117 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:53.2581763Z triton_mm_2397 0.0139 ms 84.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, 
num_warps=4 2025-09-07T10:07:53.2582793Z triton_mm_2389 0.0163 ms 71.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:53.2583824Z triton_mm_2392 0.0167 ms 70.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:53.2584792Z triton_mm_2396 0.0171 ms 68.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:53.2585865Z triton_mm_2403 0.0177 ms 66.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:53.2586773Z triton_mm_2388 0.0178 ms 65.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:53.2587663Z triton_mm_2395 0.0179 ms 65.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:53.2588554Z triton_mm_2399 0.0180 ms 65.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:53.2589340Z SingleProcess AUTOTUNE benchmarking takes 0.2770 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:54.3187322Z Autotune Choices Stats: 2025-09-07T10:07:54.3188411Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_2298", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.012000000104308128, 
"best_triton_pos": 0} 2025-09-07T10:07:54.3751345Z AUTOTUNE mm(384x1568, 1568x1152) 2025-09-07T10:07:54.3751643Z strides: [1, 384], [1152, 1] 2025-09-07T10:07:54.3751943Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:54.3752649Z triton_mm_2298 0.0120 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:54.3753314Z mm 0.0121 ms 99.5% 2025-09-07T10:07:54.3753954Z triton_mm_2302 0.0136 ms 88.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:54.3755940Z triton_mm_2294 0.0164 ms 73.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:54.3756919Z triton_mm_2297 0.0171 ms 70.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:54.3758194Z triton_mm_2301 0.0173 ms 69.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:54.3759180Z triton_mm_2308 0.0177 ms 67.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:54.3760165Z triton_mm_2300 0.0179 ms 67.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:54.3761147Z triton_mm_2293 0.0180 ms 66.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:54.3762111Z triton_mm_2304 0.0181 ms 
66.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:54.3762959Z SingleProcess AUTOTUNE benchmarking takes 0.9269 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:55.2811880Z Autotune Choices Stats: 2025-09-07T10:07:55.2813473Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010688000358641148, "best_triton_pos": 1, "best_triton_time": 0.013183999806642532, "best_triton_kernel": "triton_mm_1783", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:07:55.3251270Z AUTOTUNE mm(1000x1568, 1568x384) 2025-09-07T10:07:55.3252113Z strides: [1, 1000], [384, 1] 2025-09-07T10:07:55.3252647Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:55.3253117Z mm 0.0107 ms 100.0% 2025-09-07T10:07:55.3254212Z triton_mm_1783 0.0132 ms 81.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:55.3255824Z triton_mm_1787 0.0147 ms 72.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:55.3256970Z triton_mm_1782 0.0172 ms 62.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:55.3258097Z triton_mm_1786 0.0172 ms 62.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:55.3259225Z triton_mm_1785 0.0175 ms 61.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:55.3260352Z triton_mm_1789 0.0184 ms 58.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:55.3261539Z triton_mm_1778 0.0187 ms 57.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:55.3262660Z triton_mm_1792 0.0201 ms 53.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:55.3264287Z triton_mm_1793 0.0205 ms 52.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:55.3265340Z SingleProcess AUTOTUNE benchmarking takes 0.2654 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:55.8587092Z Autotune Choices Stats: 2025-09-07T10:07:55.8589451Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_1911", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.006111999973654747, "best_triton_pos": 0} 2025-09-07T10:07:55.8821752Z AUTOTUNE mm(384x8, 8x384) 2025-09-07T10:07:55.8822146Z strides: [1, 384], [384, 1] 2025-09-07T10:07:55.8822523Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:55.8823441Z triton_mm_1911 0.0061 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:55.8824851Z triton_mm_1912 0.0061 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=8 2025-09-07T10:07:55.8826778Z triton_mm_1917 0.0061 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:55.8828156Z triton_mm_1910 0.0061 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:07:55.8829515Z triton_mm_1913 0.0061 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:55.8830861Z triton_mm_1914 0.0061 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:55.8832213Z triton_mm_1915 0.0061 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:55.8833552Z triton_mm_1916 0.0061 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:55.8834904Z triton_mm_1920 0.0063 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:55.8836432Z triton_mm_1918 0.0064 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:55.8837626Z SingleProcess AUTOTUNE benchmarking takes 0.2644 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:07:57.3012694Z Autotune Choices Stats: 2025-09-07T10:07:57.3014178Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", 
"best_time": 0.016287999227643013, "best_triton_pos": 1, "best_triton_time": 0.019231999292969704, "best_triton_kernel": "triton_mm_4422", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:07:57.3249816Z AUTOTUNE mm(192x6272, 6272x576) 2025-09-07T10:07:57.3250112Z strides: [1, 192], [576, 1] 2025-09-07T10:07:57.3250400Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:57.3250693Z mm 0.0163 ms 100.0% 2025-09-07T10:07:57.3251368Z triton_mm_4422 0.0192 ms 84.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:57.3253133Z triton_mm_4426 0.0212 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:57.3254278Z triton_mm_4430 0.0251 ms 64.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:57.3255932Z triton_mm_4421 0.0406 ms 40.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:57.3257104Z triton_mm_4436 0.0407 ms 40.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:57.3258257Z triton_mm_4420 0.0420 ms 38.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:57.3259402Z triton_mm_4425 0.0436 ms 37.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=8 2025-09-07T10:07:57.3260552Z triton_mm_4429 0.0441 ms 36.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:57.3261838Z triton_mm_4435 0.0508 ms 32.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:57.3262860Z SingleProcess AUTOTUNE benchmarking takes 0.8930 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:59.0311662Z Autotune Choices Stats: 2025-09-07T10:07:59.0313050Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.0161920003592968, "best_triton_pos": 1, "best_triton_time": 0.019360000267624855, "best_triton_kernel": "triton_mm_4460", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:07:59.1183145Z AUTOTUNE mm(576x6272, 6272x192) 2025-09-07T10:07:59.1183464Z strides: [1, 576], [192, 1] 2025-09-07T10:07:59.1183731Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:59.1184001Z mm 0.0162 ms 100.0% 2025-09-07T10:07:59.1184622Z triton_mm_4460 0.0194 ms 83.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:59.1185942Z triton_mm_4464 0.0210 ms 77.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:59.1186955Z triton_mm_4468 0.0249 ms 65.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:59.1187931Z triton_mm_4459 0.0400 ms 40.4% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:59.1188890Z triton_mm_4474 0.0402 ms 40.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:59.1189855Z triton_mm_4458 0.0421 ms 38.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:59.1190791Z triton_mm_4463 0.0430 ms 37.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:59.1192279Z triton_mm_4467 0.0439 ms 36.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:59.1193377Z triton_mm_4457 0.0495 ms 32.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:07:59.1194217Z SingleProcess AUTOTUNE benchmarking takes 0.9342 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:07:59.9827977Z Autotune Choices Stats: 2025-09-07T10:07:59.9829756Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_4545", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.01532800029963255, "best_triton_pos": 0} 2025-09-07T10:07:59.9969349Z AUTOTUNE mm(486x1568, 1568x192) 2025-09-07T10:07:59.9969668Z strides: [1, 486], [192, 1] 2025-09-07T10:07:59.9969975Z dtypes: torch.float16, torch.float16 2025-09-07T10:07:59.9970773Z triton_mm_4545 0.0153 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, 
BLOCK_M=32, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:59.9971987Z triton_mm_4546 0.0162 ms 94.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:07:59.9972737Z mm 0.0166 ms 92.3% 2025-09-07T10:07:59.9973453Z triton_mm_4551 0.0171 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:59.9974657Z triton_mm_4550 0.0174 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:07:59.9976622Z triton_mm_4554 0.0181 ms 84.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:07:59.9977800Z triton_mm_4547 0.0181 ms 84.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:07:59.9978958Z triton_mm_4553 0.0184 ms 83.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:59.9980134Z triton_mm_4555 0.0185 ms 82.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:07:59.9981373Z triton_mm_4557 0.0197 ms 77.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:07:59.9982381Z SingleProcess AUTOTUNE benchmarking takes 0.2899 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:08:03.1499240Z Autotune 
Choices Stats: 2025-09-07T10:08:03.1501228Z {"num_choices": 28, "num_triton_choices": 19, "best_kernel": "decompose_k_mm_4_split_1", "best_kernel_desc": "k_split=4", "best_time": 0.013632000423967838, "best_triton_pos": 4, "best_triton_time": 0.018592000007629395, "best_triton_kernel": "triton_mm_4479", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:08:03.2877135Z AUTOTUNE mm(192x6272, 6272x192) 2025-09-07T10:08:03.2877540Z strides: [1, 192], [192, 1] 2025-09-07T10:08:03.2877841Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:03.2878229Z decompose_k_mm_4_split_1 0.0136 ms 100.0% k_split=4 2025-09-07T10:08:03.2878656Z decompose_k_mm_7_split_2 0.0136 ms 100.0% k_split=7 2025-09-07T10:08:03.2879309Z mm 0.0138 ms 98.6% 2025-09-07T10:08:03.2879606Z decompose_k_mm_2_split_0 0.0143 ms 95.1% k_split=2 2025-09-07T10:08:03.2880576Z triton_mm_4479 0.0186 ms 73.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:03.2881794Z triton_mm_4483 0.0207 ms 65.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:03.2883288Z triton_mm_4487 0.0244 ms 56.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:03.2884148Z decompose_k_mm_16_split_6 0.0348 ms 39.2% k_split=16 2025-09-07T10:08:03.2884580Z decompose_k_mm_14_split_3 0.0348 ms 39.2% k_split=14 2025-09-07T10:08:03.2885370Z decompose_k_mm_8_split_5 0.0355 ms 38.4% k_split=8 2025-09-07T10:08:03.2886050Z SingleProcess AUTOTUNE benchmarking takes 2.8785 seconds and 0.0002 seconds precompiling for 28 choices 2025-09-07T10:08:03.7590976Z 
Autotune Choices Stats: 2025-09-07T10:08:03.7591993Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_1831", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.007199999876320362, "best_triton_pos": 0} 2025-09-07T10:08:03.8090552Z AUTOTUNE mm(8x384, 384x1152) 2025-09-07T10:08:03.8090842Z strides: [384, 1], [1152, 1] 2025-09-07T10:08:03.8091144Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:03.8091838Z triton_mm_1831 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:03.8092865Z triton_mm_1835 0.0075 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:03.8093516Z mm 0.0077 ms 93.8% 2025-09-07T10:08:03.8094089Z triton_mm_1829 0.0080 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:03.8095445Z triton_mm_1830 0.0080 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:03.8096437Z triton_mm_1834 0.0083 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:03.8097434Z triton_mm_1843 0.0083 ms 87.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:03.8098345Z triton_mm_1839 0.0083 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:03.8099262Z triton_mm_1828 0.0084 ms 85.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:08:03.8100166Z triton_mm_1838 0.0085 ms 84.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:03.8100957Z SingleProcess AUTOTUNE benchmarking takes 0.2205 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:08:04.6855634Z Autotune Choices Stats: 2025-09-07T10:08:04.6856692Z {"num_choices": 12, "num_triton_choices": 11, "best_kernel": "triton_bmm_1973", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2", "best_time": 0.012864000163972378, "best_triton_pos": 0} 2025-09-07T10:08:04.8154072Z AUTOTUNE bmm(96x1x197, 96x197x32) 2025-09-07T10:08:04.8154368Z strides: [197, 0, 1], [6336, 1, 197] 2025-09-07T10:08:04.8154637Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:04.8155427Z triton_bmm_1973 0.0129 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T10:08:04.8156758Z triton_bmm_1977 0.0129 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:04.8157394Z bmm 0.0131 ms 98.5% 2025-09-07T10:08:04.8157976Z triton_bmm_1968 0.0131 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:08:04.8158959Z triton_bmm_1976 0.0131 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T10:08:04.8159938Z triton_bmm_1969 0.0131 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:04.8160903Z triton_bmm_1970 0.0131 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:04.8161881Z triton_bmm_1972 0.0131 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:08:04.8162845Z triton_bmm_1975 0.0131 ms 98.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T10:08:04.8163816Z triton_bmm_1974 0.0132 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T10:08:04.8164668Z SingleProcess AUTOTUNE benchmarking takes 0.2572 seconds and 0.0002 seconds precompiling for 12 choices 2025-09-07T10:08:07.8566632Z Autotune Choices Stats: 2025-09-07T10:08:07.8568369Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010495999827980995, "best_triton_pos": 1, "best_triton_time": 0.011776000261306763, "best_triton_kernel": "triton_mm_1768", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:08:08.0080708Z AUTOTUNE mm(1568x1000, 1000x384) 2025-09-07T10:08:08.0081001Z strides: [1000, 1], [384, 1] 2025-09-07T10:08:08.0081293Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:08.0081584Z mm 0.0105 ms 
100.0% 2025-09-07T10:08:08.0082206Z triton_mm_1768 0.0118 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:08.0083249Z triton_mm_1767 0.0138 ms 75.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:08.0084243Z triton_mm_1763 0.0142 ms 73.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:08:08.0085582Z triton_mm_1774 0.0147 ms 71.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:08.0086602Z triton_mm_1766 0.0149 ms 70.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:08.0088103Z triton_mm_1773 0.0155 ms 67.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:08.0088972Z triton_mm_1764 0.0156 ms 67.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:08.0090052Z triton_mm_1770 0.0156 ms 67.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:08.0090918Z triton_mm_1765 0.0180 ms 58.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:08.0091663Z SingleProcess AUTOTUNE benchmarking takes 1.2619 seconds and 
0.0002 seconds precompiling for 20 choices 2025-09-07T10:08:08.1968653Z Autotune Choices Stats: 2025-09-07T10:08:08.1970244Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 0.008736000396311283, "best_triton_pos": 1, "best_triton_time": 0.008767999708652496, "best_triton_kernel": "triton_mm_1864", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2"} 2025-09-07T10:08:08.3983364Z AUTOTUNE mm(8x1152, 1152x384) 2025-09-07T10:08:08.3983755Z strides: [1152, 1], [384, 1] 2025-09-07T10:08:08.3984145Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:08.3984534Z mm 0.0087 ms 100.0% 2025-09-07T10:08:08.3985732Z triton_mm_1864 0.0088 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:08.3987250Z triton_mm_1868 0.0091 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:08.3988713Z triton_mm_1876 0.0111 ms 78.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:08.3990018Z triton_mm_1863 0.0116 ms 75.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:08.3991325Z triton_mm_1872 0.0117 ms 74.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:08.3992613Z triton_mm_1862 0.0118 ms 74.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=4 2025-09-07T10:08:08.3993885Z triton_mm_1867 0.0126 ms 69.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:08.3995290Z triton_mm_1871 0.0132 ms 66.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:08.3996583Z triton_mm_1874 0.0133 ms 65.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:08.3997704Z SingleProcess AUTOTUNE benchmarking takes 0.3834 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:08:08.8577664Z Autotune Choices Stats: 2025-09-07T10:08:08.8578694Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_1897", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.007040000054985285, "best_triton_pos": 0} 2025-09-07T10:08:09.0911646Z AUTOTUNE mm(8x384, 384x384) 2025-09-07T10:08:09.0911902Z strides: [384, 1], [384, 1] 2025-09-07T10:08:09.0912144Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:09.0912805Z triton_mm_1897 0.0070 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:09.0914110Z triton_mm_1901 0.0072 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:09.0914723Z mm 0.0075 ms 93.6% 2025-09-07T10:08:09.0915633Z triton_mm_1895 0.0076 ms 92.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=4 2025-09-07T10:08:09.0916667Z triton_mm_1896 0.0076 ms 92.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:09.0917670Z triton_mm_1905 0.0077 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:09.0918650Z triton_mm_1900 0.0079 ms 89.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:09.0919600Z triton_mm_1909 0.0080 ms 87.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:09.0920488Z triton_mm_1894 0.0081 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:08:09.0921399Z triton_mm_1904 0.0081 ms 86.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:09.0922185Z SingleProcess AUTOTUNE benchmarking takes 0.6915 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:08:09.4548914Z Autotune Choices Stats: 2025-09-07T10:08:09.4549907Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_1933", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.006399999838322401, "best_triton_pos": 0} 2025-09-07T10:08:09.7528761Z AUTOTUNE bmm(96x197x1, 96x1x32) 2025-09-07T10:08:09.7529501Z strides: [197, 1, 197], [32, 32, 1] 2025-09-07T10:08:09.7529979Z dtypes: torch.float16, 
torch.float16 2025-09-07T10:08:09.7530942Z triton_bmm_1933 0.0064 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:09.7532433Z triton_bmm_1935 0.0064 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:09.7533889Z triton_bmm_1927 0.0065 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:08:09.7535634Z triton_bmm_1929 0.0066 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:09.7537096Z triton_bmm_1931 0.0066 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:08:09.7538600Z triton_bmm_1936 0.0066 ms 96.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:09.7540196Z triton_bmm_1937 0.0067 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:09.7541598Z triton_bmm_1939 0.0067 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:09.7543105Z triton_bmm_1928 0.0067 ms 95.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:09.7544400Z triton_bmm_1930 0.0067 ms 95.2% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:09.7545669Z SingleProcess AUTOTUNE benchmarking takes 0.6609 seconds and 0.0002 seconds precompiling for 15 choices 2025-09-07T10:08:09.8812937Z Autotune Choices Stats: 2025-09-07T10:08:09.8814041Z {"num_choices": 14, "num_triton_choices": 13, "best_kernel": "triton_bmm_1941", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.006943999789655209, "best_triton_pos": 0} 2025-09-07T10:08:09.9894542Z AUTOTUNE bmm(96x1x32, 96x32x197) 2025-09-07T10:08:09.9894917Z strides: [32, 32, 1], [6336, 1, 32] 2025-09-07T10:08:09.9895465Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:09.9896332Z triton_bmm_1941 0.0069 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:08:09.9897563Z triton_bmm_1943 0.0069 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:09.9898904Z triton_bmm_1942 0.0070 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:09.9900197Z triton_bmm_1945 0.0070 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:08:09.9901554Z triton_bmm_1946 0.0070 ms 99.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:09.9902843Z triton_bmm_1950 0.0070 ms 99.1% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:09.9904123Z triton_bmm_1940 0.0071 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:08:09.9905558Z triton_bmm_1951 0.0071 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:08:09.9906863Z triton_bmm_1952 0.0071 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:09.9908163Z triton_bmm_1948 0.0072 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:09.9909250Z SingleProcess AUTOTUNE benchmarking takes 0.2360 seconds and 0.0002 seconds precompiling for 14 choices 2025-09-07T10:08:10.1308321Z Autotune Choices Stats: 2025-09-07T10:08:10.1309682Z {"num_choices": 15, "num_triton_choices": 14, "best_kernel": "triton_bmm_1953", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2", "best_time": 0.006976000033318996, "best_triton_pos": 0} 2025-09-07T10:08:10.4792016Z AUTOTUNE bmm(96x32x1, 96x1x197) 2025-09-07T10:08:10.4792317Z strides: [32, 1, 32], [197, 0, 1] 2025-09-07T10:08:10.4792592Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:10.4793705Z triton_bmm_1953 0.0070 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:08:10.4794776Z triton_bmm_1959 0.0070 ms 100.0% 
ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:10.4796012Z triton_bmm_1957 0.0071 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:08:10.4797038Z triton_bmm_1954 0.0071 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:08:10.4798033Z triton_bmm_1962 0.0071 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:10.4799146Z triton_bmm_1963 0.0071 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:10.4800224Z triton_bmm_1960 0.0071 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:10.4801292Z triton_bmm_1955 0.0072 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:10.4802370Z triton_bmm_1956 0.0072 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:10.4803429Z triton_bmm_1958 0.0072 ms 96.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:08:10.4804350Z SingleProcess AUTOTUNE benchmarking takes 0.4892 seconds and 0.0002 seconds precompiling for 15 choices 
2025-09-07T10:08:10.9029506Z Autotune Choices Stats: 2025-09-07T10:08:10.9030967Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.010400000028312206, "best_triton_pos": 1, "best_triton_time": 0.011455999687314034, "best_triton_kernel": "triton_mm_2321", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:08:11.1134466Z AUTOTUNE mm(1568x1152, 1152x384) 2025-09-07T10:08:11.1135238Z strides: [1152, 1], [384, 1] 2025-09-07T10:08:11.1135698Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:11.1136167Z mm 0.0104 ms 100.0% 2025-09-07T10:08:11.1137164Z triton_mm_2321 0.0115 ms 90.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:11.1138888Z triton_mm_2327 0.0135 ms 77.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:11.1140339Z triton_mm_2320 0.0139 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:11.1141834Z triton_mm_2316 0.0139 ms 74.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:08:11.1143149Z triton_mm_2317 0.0144 ms 72.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:11.1144315Z triton_mm_2319 0.0152 ms 68.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:11.1145828Z 
triton_mm_2326 0.0156 ms 66.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:11.1147005Z triton_mm_2323 0.0156 ms 66.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:11.1148169Z triton_mm_2313 0.0172 ms 60.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:11.1149191Z SingleProcess AUTOTUNE benchmarking takes 0.5878 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:08:12.0222744Z Autotune Choices Stats: 2025-09-07T10:08:12.0224085Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.011392000131309032, "best_triton_pos": 1, "best_triton_time": 0.012223999947309494, "best_triton_kernel": "triton_mm_4455", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T10:08:12.0381526Z AUTOTUNE mm(6272x576, 576x192) 2025-09-07T10:08:12.0381807Z strides: [576, 1], [192, 1] 2025-09-07T10:08:12.0382066Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:12.0382369Z mm 0.0114 ms 100.0% 2025-09-07T10:08:12.0382999Z triton_mm_4455 0.0122 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:12.0384025Z triton_mm_4448 0.0123 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.0385202Z triton_mm_4444 0.0125 ms 91.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:08:12.0386241Z triton_mm_4454 0.0131 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.0387237Z triton_mm_4447 0.0134 ms 85.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:12.0388215Z triton_mm_4451 0.0137 ms 83.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:12.0389273Z triton_mm_4446 0.0146 ms 77.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.0390304Z triton_mm_4450 0.0152 ms 74.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.0391278Z triton_mm_4453 0.0154 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.0392130Z SingleProcess AUTOTUNE benchmarking takes 0.7101 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:08:12.2331995Z Autotune Choices Stats: 2025-09-07T10:08:12.2333297Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_4501", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.009279999881982803, "best_triton_pos": 0} 2025-09-07T10:08:12.2941960Z AUTOTUNE mm(6272x192, 192x192) 2025-09-07T10:08:12.2942340Z strides: [192, 1], [192, 1] 
2025-09-07T10:08:12.2942685Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:12.2943967Z triton_mm_4501 0.0093 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:08:12.2945743Z triton_mm_4505 0.0093 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.2946610Z mm 0.0093 ms 99.7% 2025-09-07T10:08:12.2947405Z triton_mm_4511 0.0095 ms 97.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.2948712Z triton_mm_4504 0.0096 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:12.2950019Z triton_mm_4508 0.0096 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:12.2951305Z triton_mm_4503 0.0097 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.2952594Z triton_mm_4512 0.0099 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:12.2953914Z triton_mm_4507 0.0100 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:12.2955314Z triton_mm_4510 0.0101 ms 91.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4 2025-09-07T10:08:12.2956464Z SingleProcess AUTOTUNE benchmarking takes 0.2536 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T10:08:12.8217674Z Autotune Choices Stats: 2025-09-07T10:08:12.8219207Z {"num_choices": 6, "num_triton_choices": 5, "best_kernel": "triton_bmm_4514", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2", "best_time": 0.01196799986064434, "best_triton_pos": 0} 2025-09-07T10:08:13.0349828Z AUTOTUNE bmm(9408x9x9, 9408x9x32) 2025-09-07T10:08:13.0350397Z strides: [81, 1, 9], [288, 32, 1] 2025-09-07T10:08:13.0350903Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:13.0352199Z triton_bmm_4514 0.0120 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:08:13.0354472Z triton_bmm_4515 0.0120 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:08:13.0357470Z triton_bmm_4516 0.0120 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=2 2025-09-07T10:08:13.0359876Z triton_bmm_4517 0.0120 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=2 2025-09-07T10:08:13.0362250Z triton_bmm_4513 0.0121 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:08:13.0363504Z bmm 0.0397 ms 30.1% 2025-09-07T10:08:13.0364258Z SingleProcess AUTOTUNE benchmarking takes 0.7396 seconds and 0.0002 seconds precompiling for 6 choices 2025-09-07T10:08:13.1419707Z 
Autotune Choices Stats: 2025-09-07T10:08:13.1422261Z {"num_choices": 7, "num_triton_choices": 6, "best_kernel": "triton_bmm_4523", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1", "best_time": 0.01228800043463707, "best_triton_pos": 0} 2025-09-07T10:08:13.2686101Z AUTOTUNE bmm(9408x9x32, 9408x32x9) 2025-09-07T10:08:13.2686489Z strides: [288, 32, 1], [288, 1, 32] 2025-09-07T10:08:13.2686819Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:13.2687635Z triton_bmm_4523 0.0123 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=1 2025-09-07T10:08:13.2688867Z triton_bmm_4522 0.0123 ms 99.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=1 2025-09-07T10:08:13.2690188Z triton_bmm_4520 0.0124 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=1 2025-09-07T10:08:13.2691495Z triton_bmm_4519 0.0125 ms 98.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T10:08:13.2692787Z triton_bmm_4518 0.0140 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=1 2025-09-07T10:08:13.2694086Z triton_bmm_4521 0.0143 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=16, BLOCK_N=16, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=1 2025-09-07T10:08:13.2694912Z bmm 0.0236 ms 52.0% 2025-09-07T10:08:13.2695724Z SingleProcess AUTOTUNE benchmarking takes 0.2326 seconds and 0.0005 seconds precompiling for 7 choices 
2025-09-07T10:08:13.8795594Z Autotune Choices Stats: 2025-09-07T10:08:13.8796899Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "triton_mm_4532", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.010143999941647053, "best_triton_pos": 0} 2025-09-07T10:08:14.1485824Z AUTOTUNE mm(1568x486, 486x192) 2025-09-07T10:08:14.1486175Z strides: [486, 1], [192, 1] 2025-09-07T10:08:14.1486439Z dtypes: torch.float16, torch.float16 2025-09-07T10:08:14.1487139Z triton_mm_4532 0.0101 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:08:14.1488166Z triton_mm_4535 0.0106 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:14.1489140Z triton_mm_4531 0.0109 ms 92.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:08:14.1490202Z triton_mm_4534 0.0111 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:14.1491267Z triton_mm_4525 0.0112 ms 90.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:08:14.1492755Z triton_mm_4526 0.0115 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:08:14.1493969Z triton_mm_4536 0.0115 ms 88.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, 
GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:08:14.1495541Z triton_mm_4538 0.0118 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:08:14.1496643Z triton_mm_4533 0.0124 ms 81.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:08:14.1497313Z mm 0.0126 ms 80.3% 2025-09-07T10:08:14.1497791Z SingleProcess AUTOTUNE benchmarking takes 0.8790 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T10:08:33.6514697Z W0907 10:08:33.649000 156268 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:09:10.1436278Z pass 2025-09-07T10:09:17.6114473Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:09:17.6116218Z import pynvml # type: ignore[import] 2025-09-07T10:09:21.0388484Z 2025-09-07T10:09:24.1918837Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:09:24.1919182Z loading model: 0it [00:03, ?it/s] 2025-09-07T10:09:24.1919469Z cuda train xcit_large_24_p8_224 2025-09-07T10:10:19.2126822Z Autotune Choices Stats: 2025-09-07T10:10:19.2128189Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.05363199859857559, "best_triton_pos": 1, "best_triton_time": 0.06748799979686737, "best_triton_kernel": "triton_mm_127", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:10:19.2387248Z AUTOTUNE addmm(6272x3072, 6272x768, 768x3072) 2025-09-07T10:10:19.2387576Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:10:19.2387884Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:10:19.2388196Z bias_addmm 0.0536 ms 100.0% 2025-09-07T10:10:19.2388825Z triton_mm_127 0.0675 ms 79.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:19.2389811Z triton_mm_126 0.0760 ms 70.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:19.2390789Z triton_mm_128 0.0761 ms 70.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:19.2391756Z triton_mm_121 0.0800 ms 67.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:19.2392719Z triton_mm_120 0.0902 ms 59.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:19.2393673Z triton_mm_124 0.0917 ms 58.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:19.2394623Z triton_mm_123 0.0926 ms 57.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:19.2396447Z triton_mm_119 0.0929 ms 57.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:19.2397485Z triton_mm_122 0.0932 ms 57.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:19.2398281Z SingleProcess AUTOTUNE benchmarking takes 0.5120 seconds and 0.0004 seconds precompiling for 21 choices 2025-09-07T10:10:20.3217010Z Autotune Choices Stats: 2025-09-07T10:10:20.3218353Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.020447999238967896, "best_triton_pos": 1, "best_triton_time": 0.02518399991095066, "best_triton_kernel": "triton_mm_2660", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:10:20.3525355Z AUTOTUNE addmm(6280x768, 6280x768, 768x768) 2025-09-07T10:10:20.3525687Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:10:20.3526003Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:10:20.3526340Z bias_addmm 0.0204 ms 100.0% 2025-09-07T10:10:20.3526989Z triton_mm_2660 0.0252 ms 81.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=3, num_warps=4 2025-09-07T10:10:20.3528002Z triton_mm_2666 0.0253 ms 80.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:20.3528982Z triton_mm_2667 0.0285 ms 71.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:20.3529610Z addmm 0.0294 ms 69.6% 2025-09-07T10:10:20.3530205Z triton_mm_2659 0.0299 ms 68.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:20.3531184Z triton_mm_2662 0.0314 ms 65.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:20.3532167Z triton_mm_2661 0.0317 ms 64.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:20.3533156Z triton_mm_2663 0.0320 ms 64.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:20.3534131Z triton_mm_2665 0.0320 ms 63.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:20.3535153Z SingleProcess AUTOTUNE benchmarking takes 0.3316 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:10:21.3359678Z Autotune Choices Stats: 2025-09-07T10:10:21.3360739Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_2708", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=5, num_warps=2", "best_time": 0.009600000455975533, "best_triton_pos": 0} 2025-09-07T10:10:21.3592935Z AUTOTUNE addmm(8x3072, 8x768, 768x3072) 2025-09-07T10:10:21.3593238Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:10:21.3593534Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:10:21.3594217Z triton_mm_2708 0.0096 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:10:21.3596075Z triton_mm_2712 0.0101 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:21.3596940Z bias_addmm 0.0106 ms 90.4% 2025-09-07T10:10:21.3597577Z triton_mm_2716 0.0112 ms 85.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:21.3598949Z triton_mm_2720 0.0114 ms 84.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:21.3600017Z triton_mm_2707 0.0120 ms 80.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:10:21.3601043Z triton_mm_2711 0.0121 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:21.3602114Z triton_mm_2706 0.0122 ms 78.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:21.3603164Z triton_mm_2705 0.0126 ms 76.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:10:21.3604213Z triton_mm_2715 0.0132 ms 72.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:21.3605299Z SingleProcess AUTOTUNE benchmarking takes 0.2647 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:10:22.0296782Z Autotune Choices Stats: 2025-09-07T10:10:22.0297876Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "triton_mm_2636", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.008352000266313553, "best_triton_pos": 0} 2025-09-07T10:10:22.0534030Z AUTOTUNE addmm(8x768, 8x768, 768x768) 2025-09-07T10:10:22.0534415Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:10:22.0534745Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:10:22.0535672Z triton_mm_2636 0.0084 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:10:22.0536898Z triton_mm_2640 0.0087 ms 96.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:22.0537693Z bias_addmm 0.0091 ms 92.2% 2025-09-07T10:10:22.0538381Z triton_mm_2644 0.0097 ms 86.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:22.0539393Z triton_mm_2648 0.0100 ms 83.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:22.0540365Z triton_mm_2635 0.0104 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:10:22.0541424Z triton_mm_2634 0.0107 ms 78.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:22.0542391Z triton_mm_2639 0.0110 ms 76.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:22.0543356Z triton_mm_2633 0.0113 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:10:22.0544831Z triton_mm_2643 0.0118 ms 70.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:22.0545882Z SingleProcess AUTOTUNE benchmarking takes 0.4358 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:10:24.7599500Z Autotune Choices Stats: 2025-09-07T10:10:24.7601443Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.05331199988722801, "best_triton_pos": 1, "best_triton_time": 0.07648000121116638, "best_triton_kernel": "triton_convolution2d_0", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:10:24.8137116Z AUTOTUNE convolution(8x3x224x224, 192x3x3x3) 2025-09-07T10:10:24.8137455Z strides: [150528, 1, 672, 3], [27, 1, 9, 3] 2025-09-07T10:10:24.8137766Z dtypes: torch.float16, torch.float16 2025-09-07T10:10:24.8138055Z convolution 0.0533 ms 100.0% 2025-09-07T10:10:24.8138811Z triton_convolution2d_0 0.0765 ms 69.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, 
KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:24.8140059Z triton_convolution2d_4 0.0796 ms 67.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:24.8141273Z triton_convolution2d_5 0.0851 ms 62.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:24.8142593Z triton_convolution2d_3 0.0924 ms 57.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:24.8143820Z triton_convolution2d_6 0.0976 ms 54.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:24.8145364Z triton_convolution2d_2 0.1074 ms 49.6% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:10:24.8146642Z triton_convolution2d_1 0.1464 ms 36.4% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:24.8147770Z SingleProcess AUTOTUNE benchmarking takes 0.2091 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:10:25.0448426Z Autotune Choices Stats: 2025-09-07T10:10:25.0449818Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.08870399743318558, "best_triton_pos": 1, "best_triton_time": 0.16927999258041382, 
"best_triton_kernel": "triton_convolution2d_10", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8"} 2025-09-07T10:10:25.0702528Z AUTOTUNE convolution(8x192x112x112, 384x192x3x3) 2025-09-07T10:10:25.0702876Z strides: [2408448, 1, 21504, 192], [1728, 1, 576, 192] 2025-09-07T10:10:25.0703211Z dtypes: torch.float16, torch.float16 2025-09-07T10:10:25.0703491Z convolution 0.0887 ms 100.0% 2025-09-07T10:10:25.0704240Z triton_convolution2d_10 0.1693 ms 52.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:25.0706319Z triton_convolution2d_11 0.1708 ms 51.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:25.0707747Z triton_convolution2d_12 0.1732 ms 51.2% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:25.0709162Z triton_convolution2d_13 0.1800 ms 49.3% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:25.0710383Z triton_convolution2d_7 0.2017 ms 44.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:25.0711606Z triton_convolution2d_8 0.2474 ms 35.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, 
UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:25.0712821Z triton_convolution2d_9 0.7404 ms 12.0% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:10:25.0713784Z SingleProcess AUTOTUNE benchmarking takes 0.2549 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:10:25.3120367Z Autotune Choices Stats: 2025-09-07T10:10:25.3121753Z {"num_choices": 8, "num_triton_choices": 7, "best_kernel": "convolution", "best_time": 0.06560000032186508, "best_triton_pos": 1, "best_triton_time": 0.16518400609493256, "best_triton_kernel": "triton_convolution2d_14", "best_triton_kernel_desc": "ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4"} 2025-09-07T10:10:25.3388843Z AUTOTUNE convolution(8x384x56x56, 768x384x3x3) 2025-09-07T10:10:25.3389181Z strides: [1204224, 1, 21504, 384], [3456, 1, 1152, 384] 2025-09-07T10:10:25.3389495Z dtypes: torch.float16, torch.float16 2025-09-07T10:10:25.3389788Z convolution 0.0656 ms 100.0% 2025-09-07T10:10:25.3390535Z triton_convolution2d_14 0.1652 ms 39.7% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:25.3391765Z triton_convolution2d_18 0.1836 ms 35.7% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:25.3392993Z triton_convolution2d_17 0.2058 ms 31.9% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:25.3394222Z 
triton_convolution2d_19 0.2143 ms 30.6% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=256, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:25.3395766Z triton_convolution2d_20 0.2158 ms 30.4% ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=8 2025-09-07T10:10:25.3397001Z triton_convolution2d_15 0.2920 ms 22.5% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=256, BLOCK_N=64, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=2, num_warps=4 2025-09-07T10:10:25.3398217Z triton_convolution2d_16 0.8308 ms 7.9% ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=1024, BLOCK_N=16, GROUPS=1, KERNEL_H=3, KERNEL_W=3, PADDING_H=1, PADDING_W=1, STRIDE_H=2, STRIDE_W=2, UNROLL=False, num_stages=1, num_warps=8 2025-09-07T10:10:25.3399473Z SingleProcess AUTOTUNE benchmarking takes 0.2672 seconds and 0.0002 seconds precompiling for 8 choices 2025-09-07T10:10:25.6041213Z Autotune Choices Stats: 2025-09-07T10:10:25.6042579Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "triton_mm_25", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.00723200011998415, "best_triton_pos": 0} 2025-09-07T10:10:25.6624023Z AUTOTUNE addmm(784x768, 784x64, 64x768) 2025-09-07T10:10:25.6624362Z strides: [0, 1], [64, 1], [1, 64] 2025-09-07T10:10:25.6624687Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:10:25.6625553Z triton_mm_25 0.0072 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:25.6626559Z triton_mm_28 0.0074 ms 98.3% ACC_TYPE='tl.float32', 
ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:10:25.6627703Z triton_mm_29 0.0074 ms 98.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:25.6628658Z triton_mm_22 0.0074 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=32, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:10:25.6629610Z triton_mm_32 0.0074 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:25.6630562Z triton_mm_30 0.0076 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:25.6631510Z triton_mm_31 0.0077 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:25.6632455Z triton_mm_33 0.0077 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:25.6633402Z triton_mm_27 0.0078 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:10:25.6634346Z triton_mm_34 0.0078 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:25.6635329Z SingleProcess AUTOTUNE benchmarking takes 0.3220 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:10:26.0857107Z Autotune Choices Stats: 
2025-09-07T10:10:26.0858402Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.04310400038957596, "best_triton_pos": 1, "best_triton_time": 0.05375999957323074, "best_triton_kernel": "triton_mm_57", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:10:26.1280030Z AUTOTUNE addmm(6272x2304, 6272x768, 768x2304) 2025-09-07T10:10:26.1280360Z strides: [0, 1], [768, 1], [1, 768] 2025-09-07T10:10:26.1280669Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:10:26.1280985Z bias_addmm 0.0431 ms 100.0% 2025-09-07T10:10:26.1281620Z triton_mm_57 0.0538 ms 80.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:26.1282965Z triton_mm_58 0.0600 ms 71.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:26.1284051Z triton_mm_56 0.0612 ms 70.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:26.1285551Z triton_mm_51 0.0640 ms 67.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:26.1286771Z triton_mm_50 0.0704 ms 61.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:26.1287815Z triton_mm_54 0.0705 ms 61.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:26.1288719Z triton_mm_53 0.0713 
ms 60.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:26.1289284Z addmm 0.0727 ms 59.3% 2025-09-07T10:10:26.1289810Z triton_mm_52 0.0727 ms 59.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:26.1290580Z SingleProcess AUTOTUNE benchmarking takes 0.4650 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:10:26.3183384Z Autotune Choices Stats: 2025-09-07T10:10:26.3184400Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_71", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.016704000532627106, "best_triton_pos": 0} 2025-09-07T10:10:26.3414847Z AUTOTUNE bmm(128x48x784, 128x784x48) 2025-09-07T10:10:26.3415451Z strides: [37632, 784, 1], [37632, 48, 1] 2025-09-07T10:10:26.3415794Z dtypes: torch.float16, torch.float16 2025-09-07T10:10:26.3416506Z triton_bmm_71 0.0167 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:26.3417181Z bmm 0.0169 ms 99.1% 2025-09-07T10:10:26.3417809Z triton_bmm_67 0.0169 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:26.3418685Z triton_bmm_73 0.0170 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:26.3419515Z triton_bmm_62 0.0176 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:26.3420357Z triton_bmm_66 0.0176 ms 94.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:10:26.3421198Z triton_bmm_70 0.0177 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:26.3422112Z triton_bmm_61 0.0186 ms 89.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:26.3422941Z triton_bmm_69 0.0187 ms 89.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:26.3423776Z triton_bmm_63 0.0190 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:26.3425081Z SingleProcess AUTOTUNE benchmarking takes 0.2119 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T10:10:26.5518650Z Autotune Choices Stats: 2025-09-07T10:10:26.5519686Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_85", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.014047999866306782, "best_triton_pos": 0} 2025-09-07T10:10:26.5851028Z AUTOTUNE bmm(128x48x48, 128x48x784) 2025-09-07T10:10:26.5851334Z strides: [2304, 48, 1], [37632, 784, 1] 2025-09-07T10:10:26.5851621Z dtypes: torch.float16, torch.float16 2025-09-07T10:10:26.5852293Z triton_bmm_85 0.0140 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, 
USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:26.5853311Z triton_bmm_90 0.0140 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:26.5854309Z triton_bmm_82 0.0141 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:26.5855701Z triton_bmm_86 0.0142 ms 98.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:26.5856688Z triton_bmm_83 0.0143 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:26.5857681Z triton_bmm_84 0.0148 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:26.5865566Z triton_bmm_87 0.0149 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:26.5866485Z triton_bmm_79 0.0153 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:10:26.5867340Z triton_bmm_88 0.0155 ms 90.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:26.5868188Z triton_bmm_80 0.0156 ms 90.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:10:26.5868911Z SingleProcess AUTOTUNE benchmarking 
takes 0.2431 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:10:27.0677410Z Autotune Choices Stats: 2025-09-07T10:10:27.0678943Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.018912000581622124, "best_triton_pos": 1, "best_triton_time": 0.022911999374628067, "best_triton_kernel": "triton_mm_107", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:10:27.1193074Z AUTOTUNE mm(6272x768, 768x768) 2025-09-07T10:10:27.1193397Z strides: [768, 1], [1, 768] 2025-09-07T10:10:27.1193684Z dtypes: torch.float16, torch.float16 2025-09-07T10:10:27.1193967Z mm 0.0189 ms 100.0% 2025-09-07T10:10:27.1194577Z triton_mm_107 0.0229 ms 82.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:27.1195924Z triton_mm_108 0.0239 ms 79.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:27.1197578Z triton_mm_102 0.0240 ms 78.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:27.1198578Z triton_mm_109 0.0261 ms 72.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:27.1199774Z triton_mm_101 0.0287 ms 65.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:27.1200682Z triton_mm_103 0.0294 ms 64.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, 
num_stages=4, num_warps=4 2025-09-07T10:10:27.1201582Z triton_mm_98 0.0300 ms 63.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:10:27.1202478Z triton_mm_100 0.0300 ms 63.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:27.1203380Z triton_mm_104 0.0304 ms 62.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:27.1204167Z SingleProcess AUTOTUNE benchmarking takes 0.5336 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:10:27.6089775Z Autotune Choices Stats: 2025-09-07T10:10:27.6091069Z {"num_choices": 21, "num_triton_choices": 19, "best_kernel": "bias_addmm", "best_time": 0.04556800052523613, "best_triton_pos": 2, "best_triton_time": 0.0689919963479042, "best_triton_kernel": "triton_mm_140", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:10:27.6248622Z AUTOTUNE addmm(6272x768, 6272x3072, 3072x768) 2025-09-07T10:10:27.6248968Z strides: [0, 1], [3072, 1], [1, 3072] 2025-09-07T10:10:27.6249291Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:10:27.6249612Z bias_addmm 0.0456 ms 100.0% 2025-09-07T10:10:27.6249876Z addmm 0.0564 ms 80.8% 2025-09-07T10:10:27.6250497Z triton_mm_140 0.0690 ms 66.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:27.6251478Z triton_mm_146 0.0690 ms 66.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, 
num_warps=4 2025-09-07T10:10:27.6252470Z triton_mm_147 0.0696 ms 65.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:27.6253451Z triton_mm_141 0.0765 ms 59.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:27.6254412Z triton_mm_139 0.0975 ms 46.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:27.6255746Z triton_mm_143 0.1002 ms 45.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:10:27.6256716Z triton_mm_145 0.1003 ms 45.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:27.6257734Z triton_mm_137 0.1038 ms 43.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:27.6258965Z SingleProcess AUTOTUNE benchmarking takes 0.5031 seconds and 0.0002 seconds precompiling for 21 choices 2025-09-07T10:10:28.5169529Z Autotune Choices Stats: 2025-09-07T10:10:28.5171453Z {"num_choices": 19, "num_triton_choices": 17, "best_kernel": "bias_addmm", "best_time": 0.014592000283300877, "best_triton_pos": 1, "best_triton_time": 0.014592000283300877, "best_triton_kernel": "triton_mm_2729", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:10:28.5399479Z AUTOTUNE addmm(8x768, 8x3072, 3072x768) 2025-09-07T10:10:28.5399779Z strides: [0, 1], [3072, 
1], [1, 3072] 2025-09-07T10:10:28.5400085Z dtypes: torch.float16, torch.float16, torch.float16 2025-09-07T10:10:28.5400421Z bias_addmm 0.0146 ms 100.0% 2025-09-07T10:10:28.5401084Z triton_mm_2729 0.0146 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:28.5402143Z triton_mm_2725 0.0150 ms 97.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:10:28.5403129Z triton_mm_2733 0.0170 ms 85.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:10:28.5403754Z addmm 0.0186 ms 78.4% 2025-09-07T10:10:28.5404344Z triton_mm_2737 0.0202 ms 72.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:10:28.5405724Z triton_mm_2724 0.0229 ms 63.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:10:28.5406716Z triton_mm_2723 0.0244 ms 59.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:10:28.5407687Z triton_mm_2728 0.0255 ms 57.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:10:28.5408719Z triton_mm_2722 0.0259 ms 56.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=2 2025-09-07T10:10:28.5409499Z SingleProcess AUTOTUNE benchmarking takes 0.2925 seconds and 0.0002 seconds precompiling 
for 19 choices 2025-09-07T10:11:17.8139730Z Autotune Choices Stats: 2025-09-07T10:11:17.8141437Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.05180799961090088, "best_triton_pos": 1, "best_triton_time": 0.06060799956321716, "best_triton_kernel": "triton_mm_3326", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:11:17.8807358Z AUTOTUNE mm(6272x768, 768x3072) 2025-09-07T10:11:17.8807996Z strides: [768, 1], [3072, 1] 2025-09-07T10:11:17.8808565Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:17.8809130Z mm 0.0518 ms 100.0% 2025-09-07T10:11:17.8810502Z triton_mm_3326 0.0606 ms 85.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:17.8812642Z triton_mm_3327 0.0607 ms 85.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:17.8813942Z triton_mm_3328 0.0676 ms 76.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:17.8816111Z triton_mm_3321 0.0760 ms 68.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:17.8817442Z triton_mm_3319 0.0796 ms 65.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:17.8819040Z triton_mm_3320 0.0819 ms 63.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 
2025-09-07T10:11:17.8820393Z triton_mm_3322 0.0851 ms 60.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:17.8821791Z triton_mm_3323 0.0853 ms 60.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:17.8823023Z triton_mm_3324 0.0864 ms 60.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:17.8824030Z SingleProcess AUTOTUNE benchmarking takes 0.4913 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T10:11:18.9166837Z Autotune Choices Stats: 2025-09-07T10:11:18.9167969Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_2916", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.007391999941319227, "best_triton_pos": 0} 2025-09-07T10:11:18.9727197Z AUTOTUNE mm(768x8, 8x3072) 2025-09-07T10:11:18.9727480Z strides: [1, 768], [3072, 1] 2025-09-07T10:11:18.9727766Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:18.9728466Z triton_mm_2916 0.0074 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:11:18.9729450Z triton_mm_2917 0.0074 ms 99.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:11:18.9730433Z triton_mm_2918 0.0075 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:18.9731435Z 
triton_mm_2923 0.0075 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:18.9732507Z triton_mm_2919 0.0076 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:18.9733412Z triton_mm_2922 0.0076 ms 97.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:18.9734312Z triton_mm_2915 0.0076 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:18.9735695Z triton_mm_2921 0.0076 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:18.9736624Z triton_mm_2920 0.0076 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:18.9737533Z triton_mm_2924 0.0078 ms 94.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:11:18.9738753Z SingleProcess AUTOTUNE benchmarking takes 0.2181 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:11:19.1601988Z Autotune Choices Stats: 2025-09-07T10:11:19.1603463Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_2950", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8", "best_time": 0.0072639998979866505, "best_triton_pos": 0} 
2025-09-07T10:11:19.2122029Z AUTOTUNE mm(3072x8, 8x768) 2025-09-07T10:11:19.2122461Z strides: [1, 3072], [768, 1] 2025-09-07T10:11:19.2122775Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:19.2123540Z triton_mm_2950 0.0073 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:11:19.2124610Z triton_mm_2949 0.0075 ms 97.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:11:19.2125790Z triton_mm_2953 0.0075 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:19.2126783Z triton_mm_2951 0.0076 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:19.2127752Z triton_mm_2955 0.0076 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:19.2128742Z triton_mm_2956 0.0076 ms 95.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:19.2129722Z triton_mm_2948 0.0076 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:19.2130719Z triton_mm_2954 0.0076 ms 95.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:19.2131714Z triton_mm_2952 0.0076 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, 
BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:19.2132735Z triton_mm_2957 0.0079 ms 92.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=8 2025-09-07T10:11:19.2133544Z SingleProcess AUTOTUNE benchmarking takes 0.2157 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:11:20.0208715Z Autotune Choices Stats: 2025-09-07T10:11:20.0210165Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.04652800038456917, "best_triton_pos": 1, "best_triton_time": 0.06054399907588959, "best_triton_kernel": "triton_mm_3340", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:11:20.0436929Z AUTOTUNE mm(768x6272, 6272x3072) 2025-09-07T10:11:20.0437296Z strides: [1, 768], [3072, 1] 2025-09-07T10:11:20.0437592Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:20.0437889Z mm 0.0465 ms 100.0% 2025-09-07T10:11:20.0438543Z triton_mm_3340 0.0605 ms 76.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.0439629Z triton_mm_3346 0.0711 ms 65.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.0441130Z triton_mm_3339 0.0756 ms 61.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:20.0442198Z triton_mm_3341 0.0768 ms 60.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 
2025-09-07T10:11:20.0443509Z triton_mm_3343 0.0783 ms 59.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:20.0444570Z triton_mm_3347 0.0820 ms 56.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:20.0446031Z triton_mm_3338 0.0836 ms 55.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.0447085Z triton_mm_3345 0.0883 ms 52.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.0448139Z triton_mm_3342 0.0899 ms 51.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.0449067Z SingleProcess AUTOTUNE benchmarking takes 0.7768 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:11:20.4924718Z Autotune Choices Stats: 2025-09-07T10:11:20.4926518Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.04636799916625023, "best_triton_pos": 1, "best_triton_time": 0.06425599753856659, "best_triton_kernel": "triton_mm_3378", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:11:20.6216886Z AUTOTUNE mm(3072x6272, 6272x768) 2025-09-07T10:11:20.6217315Z strides: [1, 3072], [768, 1] 2025-09-07T10:11:20.6217667Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:20.6218028Z mm 0.0464 ms 100.0% 2025-09-07T10:11:20.6218896Z triton_mm_3378 0.0643 ms 72.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.6220403Z triton_mm_3384 0.0691 ms 67.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.6221955Z triton_mm_3377 0.0759 ms 61.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:20.6223411Z triton_mm_3379 0.0767 ms 60.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:20.6224857Z triton_mm_3381 0.0804 ms 57.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:20.6226471Z triton_mm_3376 0.0814 ms 57.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.6227905Z triton_mm_3383 0.0822 ms 56.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.6229357Z triton_mm_3380 0.0830 ms 55.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:20.6231234Z triton_mm_3385 0.0832 ms 55.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:20.6232430Z SingleProcess AUTOTUNE benchmarking takes 0.5720 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:11:21.4207454Z Autotune Choices Stats: 2025-09-07T10:11:21.4209264Z 
{"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.03859199956059456, "best_triton_pos": 1, "best_triton_time": 0.04294399917125702, "best_triton_kernel": "triton_mm_3528", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8"} 2025-09-07T10:11:21.4366256Z AUTOTUNE mm(2304x6272, 6272x768) 2025-09-07T10:11:21.4366537Z strides: [1, 2304], [768, 1] 2025-09-07T10:11:21.4366815Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:21.4367093Z mm 0.0386 ms 100.0% 2025-09-07T10:11:21.4367706Z triton_mm_3528 0.0429 ms 89.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:21.4368725Z triton_mm_3522 0.0518 ms 74.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:21.4369736Z triton_mm_3521 0.0540 ms 71.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:21.4370701Z triton_mm_3527 0.0587 ms 65.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:21.4371674Z triton_mm_3524 0.0617 ms 62.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:21.4372682Z triton_mm_3520 0.0620 ms 62.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:21.4373753Z triton_mm_3517 0.0692 ms 55.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, 
BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:11:21.4374656Z triton_mm_3518 0.0743 ms 52.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:21.4375885Z triton_mm_3519 0.0789 ms 48.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:21.4376668Z SingleProcess AUTOTUNE benchmarking takes 0.4139 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:11:21.9706515Z Autotune Choices Stats: 2025-09-07T10:11:21.9707578Z {"num_choices": 17, "num_triton_choices": 16, "best_kernel": "triton_mm_2982", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4", "best_time": 0.006527999881654978, "best_triton_pos": 0} 2025-09-07T10:11:22.0472021Z AUTOTUNE mm(768x8, 8x768) 2025-09-07T10:11:22.0472296Z strides: [1, 768], [768, 1] 2025-09-07T10:11:22.0472581Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:22.0473460Z triton_mm_2982 0.0065 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:11:22.0474780Z triton_mm_2986 0.0067 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:22.0476646Z triton_mm_2987 0.0067 ms 98.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:22.0477657Z triton_mm_2984 0.0067 ms 97.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:22.0478846Z triton_mm_2978 0.0067 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:11:22.0479806Z triton_mm_2983 0.0067 ms 97.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:11:22.0480783Z triton_mm_2977 0.0068 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=1, num_warps=2 2025-09-07T10:11:22.0481752Z triton_mm_2979 0.0068 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:22.0482716Z triton_mm_2989 0.0068 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:22.0483696Z triton_mm_2985 0.0068 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:22.0484458Z SingleProcess AUTOTUNE benchmarking takes 0.2372 seconds and 0.0002 seconds precompiling for 17 choices 2025-09-07T10:11:22.4554478Z Autotune Choices Stats: 2025-09-07T10:11:22.4556107Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.02521600015461445, "best_triton_pos": 1, "best_triton_time": 0.02723200060427189, "best_triton_kernel": "triton_mm_3398", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:11:22.6268156Z AUTOTUNE mm(768x6272, 6272x768) 
2025-09-07T10:11:22.6268458Z strides: [1, 768], [768, 1] 2025-09-07T10:11:22.6268724Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:22.6269004Z mm 0.0252 ms 100.0% 2025-09-07T10:11:22.6269642Z triton_mm_3398 0.0272 ms 92.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:22.6270651Z triton_mm_3404 0.0380 ms 66.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:22.6271665Z triton_mm_3394 0.0386 ms 65.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:22.6272720Z triton_mm_3397 0.0477 ms 52.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:22.6274073Z triton_mm_3393 0.0477 ms 52.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:11:22.6275481Z triton_mm_3390 0.0481 ms 52.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:22.6276469Z triton_mm_3403 0.0524 ms 48.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:22.6277695Z triton_mm_3396 0.0551 ms 45.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:22.6278789Z triton_mm_3400 0.0551 ms 45.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:22.6279639Z SingleProcess AUTOTUNE benchmarking takes 0.5087 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:11:23.3084412Z Autotune Choices Stats: 2025-09-07T10:11:23.3086632Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.024927999824285507, "best_triton_pos": 1, "best_triton_time": 0.029983999207615852, "best_triton_kernel": "triton_mm_3024", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4"} 2025-09-07T10:11:23.4646846Z AUTOTUNE mm(768x6280, 6280x768) 2025-09-07T10:11:23.4647133Z strides: [1, 768], [768, 1] 2025-09-07T10:11:23.4647400Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:23.4647670Z mm 0.0249 ms 100.0% 2025-09-07T10:11:23.4648297Z triton_mm_3024 0.0300 ms 83.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:23.4649335Z triton_mm_3020 0.0395 ms 63.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:23.4650337Z triton_mm_3030 0.0412 ms 60.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:23.4651308Z triton_mm_3019 0.0487 ms 51.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:11:23.4652284Z triton_mm_3023 0.0491 ms 50.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:23.4653286Z 
triton_mm_3029 0.0541 ms 46.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:23.4654379Z triton_mm_3022 0.0554 ms 45.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:23.4655579Z triton_mm_3026 0.0559 ms 44.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:23.4656490Z triton_mm_3016 0.0592 ms 42.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:23.4657291Z SingleProcess AUTOTUNE benchmarking takes 0.4992 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:11:23.7053063Z Autotune Choices Stats: 2025-09-07T10:11:23.7054123Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_2898", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.00940799992531538, "best_triton_pos": 0} 2025-09-07T10:11:23.9524279Z AUTOTUNE mm(8x768, 768x3072) 2025-09-07T10:11:23.9524677Z strides: [768, 1], [3072, 1] 2025-09-07T10:11:23.9525448Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:23.9526155Z triton_mm_2898 0.0094 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:11:23.9527586Z triton_mm_2902 0.0097 ms 96.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:23.9528391Z mm 0.0098 ms 96.4% 
2025-09-07T10:11:23.9528974Z triton_mm_2910 0.0104 ms 90.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:23.9530270Z triton_mm_2906 0.0108 ms 86.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:23.9531248Z triton_mm_2896 0.0113 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:23.9532210Z triton_mm_2897 0.0113 ms 83.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:11:23.9533180Z triton_mm_2901 0.0118 ms 79.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:23.9534251Z triton_mm_2905 0.0120 ms 78.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:23.9535319Z triton_mm_2908 0.0122 ms 77.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:23.9536109Z SingleProcess AUTOTUNE benchmarking takes 0.4304 seconds and 0.0002 seconds precompiling for 18 choices 2025-09-07T10:11:25.8309559Z Autotune Choices Stats: 2025-09-07T10:11:25.8311345Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "mm", "best_time": 0.013728000223636627, "best_triton_pos": 1, "best_triton_time": 0.014688000082969666, "best_triton_kernel": "triton_mm_2935", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, 
EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4"} 2025-09-07T10:11:25.9316430Z AUTOTUNE mm(8x3072, 3072x768) 2025-09-07T10:11:25.9316773Z strides: [3072, 1], [768, 1] 2025-09-07T10:11:25.9317051Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:25.9317335Z mm 0.0137 ms 100.0% 2025-09-07T10:11:25.9318018Z triton_mm_2935 0.0147 ms 93.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:25.9319091Z triton_mm_2931 0.0156 ms 87.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:11:25.9320161Z triton_mm_2939 0.0167 ms 82.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:25.9321232Z triton_mm_2943 0.0204 ms 67.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:25.9322272Z triton_mm_2930 0.0217 ms 63.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:11:25.9323308Z triton_mm_2929 0.0223 ms 61.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:25.9324345Z triton_mm_2934 0.0247 ms 55.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:25.9325807Z triton_mm_2938 0.0270 ms 50.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:11:25.9327290Z triton_mm_2941 0.0270 ms 50.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:25.9328211Z SingleProcess AUTOTUNE benchmarking takes 0.3440 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T10:11:26.1227328Z Autotune Choices Stats: 2025-09-07T10:11:26.1228931Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_mm_2964", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2", "best_time": 0.0080960001796484, "best_triton_pos": 0} 2025-09-07T10:11:26.2110065Z AUTOTUNE mm(8x768, 768x768) 2025-09-07T10:11:26.2110492Z strides: [768, 1], [768, 1] 2025-09-07T10:11:26.2110894Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:26.2111915Z triton_mm_2964 0.0081 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=2 2025-09-07T10:11:26.2113385Z triton_mm_2968 0.0084 ms 96.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:26.2114279Z mm 0.0084 ms 96.2% 2025-09-07T10:11:26.2115359Z triton_mm_2976 0.0096 ms 84.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:26.2116784Z triton_mm_2962 0.0100 ms 81.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:26.2118199Z triton_mm_2963 0.0100 ms 81.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=32, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, 
num_warps=2 2025-09-07T10:11:26.2119642Z triton_mm_2972 0.0100 ms 80.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:26.2121129Z triton_mm_2967 0.0106 ms 76.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:26.2122592Z triton_mm_2974 0.0111 ms 72.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=16, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:26.2124027Z triton_mm_2971 0.0111 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=16, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:26.2125506Z SingleProcess AUTOTUNE benchmarking takes 0.2770 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T10:11:26.4755507Z Autotune Choices Stats: 2025-09-07T10:11:26.4757394Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.018880000337958336, "best_triton_pos": 1, "best_triton_time": 0.021247999742627144, "best_triton_kernel": "triton_mm_3009", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:11:26.6881339Z AUTOTUNE mm(6280x768, 768x768) 2025-09-07T10:11:26.6881700Z strides: [768, 1], [768, 1] 2025-09-07T10:11:26.6881981Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:26.6882271Z mm 0.0189 ms 100.0% 2025-09-07T10:11:26.6882931Z triton_mm_3009 0.0212 ms 88.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:26.6884358Z triton_mm_3002 0.0217 ms 87.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:26.6885712Z triton_mm_3004 0.0236 ms 80.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:26.6886684Z triton_mm_3010 0.0236 ms 80.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:26.6887862Z triton_mm_3003 0.0254 ms 74.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:26.6888843Z triton_mm_3011 0.0260 ms 72.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:26.6889821Z triton_mm_3006 0.0266 ms 71.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:26.6890789Z triton_mm_3007 0.0284 ms 66.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:26.6891755Z triton_mm_3005 0.0293 ms 64.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:26.6892598Z SingleProcess AUTOTUNE benchmarking takes 0.4762 seconds and 0.0004 seconds precompiling for 20 choices 2025-09-07T10:11:27.1707873Z Autotune Choices Stats: 2025-09-07T10:11:27.1709012Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.04652800038456917, "best_triton_pos": 1, "best_triton_time": 0.0613120011985302, "best_triton_kernel": "triton_mm_3364", 
"best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:11:27.3942292Z AUTOTUNE mm(6272x3072, 3072x768) 2025-09-07T10:11:27.3942584Z strides: [3072, 1], [768, 1] 2025-09-07T10:11:27.3942853Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:27.3943126Z mm 0.0465 ms 100.0% 2025-09-07T10:11:27.3943763Z triton_mm_3364 0.0613 ms 75.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.3944827Z triton_mm_3357 0.0658 ms 70.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.3946026Z triton_mm_3365 0.0662 ms 70.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.3946898Z triton_mm_3359 0.0675 ms 68.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.3947739Z triton_mm_3366 0.0691 ms 67.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:27.3948601Z triton_mm_3360 0.0715 ms 65.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:27.3949437Z triton_mm_3358 0.0856 ms 54.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:27.3950271Z triton_mm_3361 0.0926 ms 50.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, 
BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.3951437Z triton_mm_3356 0.0964 ms 48.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:27.3952169Z SingleProcess AUTOTUNE benchmarking takes 0.6617 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:11:27.6482646Z Autotune Choices Stats: 2025-09-07T10:11:27.6484577Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.018880000337958336, "best_triton_pos": 1, "best_triton_time": 0.021344000473618507, "best_triton_kernel": "triton_mm_3421", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:11:27.8735522Z AUTOTUNE mm(6272x768, 768x768) 2025-09-07T10:11:27.8735924Z strides: [768, 1], [768, 1] 2025-09-07T10:11:27.8736206Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:27.8736496Z mm 0.0189 ms 100.0% 2025-09-07T10:11:27.8737132Z triton_mm_3421 0.0213 ms 88.5% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.8738157Z triton_mm_3414 0.0215 ms 87.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.8739153Z triton_mm_3416 0.0235 ms 80.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.8740141Z triton_mm_3422 0.0236 ms 79.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 
2025-09-07T10:11:27.8741142Z triton_mm_3415 0.0255 ms 73.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:27.8742231Z triton_mm_3423 0.0259 ms 72.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:27.8743211Z triton_mm_3418 0.0267 ms 70.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:27.8744179Z triton_mm_3419 0.0285 ms 66.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:27.8745304Z triton_mm_3417 0.0292 ms 64.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:27.8746047Z SingleProcess AUTOTUNE benchmarking takes 0.4758 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:11:28.0580526Z Autotune Choices Stats: 2025-09-07T10:11:28.0581657Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_3440", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8", "best_time": 0.01398400031030178, "best_triton_pos": 0} 2025-09-07T10:11:28.1121119Z AUTOTUNE bmm(128x48x48, 128x48x784) 2025-09-07T10:11:28.1121395Z strides: [2304, 1, 48], [37632, 784, 1] 2025-09-07T10:11:28.1121649Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:28.1122248Z triton_bmm_3440 0.0140 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 
2025-09-07T10:11:28.1123179Z triton_bmm_3435 0.0141 ms 99.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.1124527Z triton_bmm_3432 0.0141 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:28.1125891Z triton_bmm_3436 0.0142 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:28.1127030Z triton_bmm_3433 0.0142 ms 98.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.1127892Z triton_bmm_3434 0.0147 ms 95.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:28.1128754Z triton_bmm_3437 0.0147 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.1129596Z triton_bmm_3438 0.0153 ms 91.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:28.1130432Z triton_bmm_3430 0.0154 ms 91.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:11:28.1131271Z triton_bmm_3428 0.0154 ms 90.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:28.1131998Z SingleProcess AUTOTUNE benchmarking takes 0.2365 seconds and 0.0003 
seconds precompiling for 18 choices 2025-09-07T10:11:28.2857246Z Autotune Choices Stats: 2025-09-07T10:11:28.2858316Z {"num_choices": 16, "num_triton_choices": 15, "best_kernel": "triton_bmm_3453", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4", "best_time": 0.016767999157309532, "best_triton_pos": 0} 2025-09-07T10:11:28.3459973Z AUTOTUNE bmm(128x48x784, 128x784x48) 2025-09-07T10:11:28.3460435Z strides: [37632, 784, 1], [37632, 1, 784] 2025-09-07T10:11:28.3460904Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:28.3462129Z triton_bmm_3453 0.0168 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:28.3463179Z bmm 0.0169 ms 99.4% 2025-09-07T10:11:28.3464155Z triton_bmm_3449 0.0169 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:28.3466038Z triton_bmm_3455 0.0169 ms 99.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:28.3467030Z triton_bmm_3452 0.0178 ms 94.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.3468019Z triton_bmm_3448 0.0180 ms 93.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:11:28.3468997Z triton_bmm_3444 0.0185 ms 90.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:28.3469974Z triton_bmm_3443 
0.0187 ms 89.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=32, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:28.3471165Z triton_bmm_3445 0.0191 ms 87.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:28.3472275Z triton_bmm_3442 0.0193 ms 86.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=32, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:11:28.3473127Z SingleProcess AUTOTUNE benchmarking takes 0.2333 seconds and 0.0002 seconds precompiling for 16 choices 2025-09-07T10:11:28.5369812Z Autotune Choices Stats: 2025-09-07T10:11:28.5370827Z {"num_choices": 19, "num_triton_choices": 18, "best_kernel": "triton_bmm_3469", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4", "best_time": 0.01360000018030405, "best_triton_pos": 0} 2025-09-07T10:11:28.5853356Z AUTOTUNE bmm(128x784x48, 128x48x48) 2025-09-07T10:11:28.5853637Z strides: [37632, 1, 784], [2304, 48, 1] 2025-09-07T10:11:28.5853926Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:28.5854595Z triton_bmm_3469 0.0136 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.5855928Z triton_bmm_3467 0.0137 ms 99.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.5856917Z triton_bmm_3464 0.0138 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:28.5857893Z triton_bmm_3468 
0.0138 ms 98.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:28.5858882Z triton_bmm_3473 0.0138 ms 98.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:28.5859877Z triton_bmm_3472 0.0140 ms 97.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.5860856Z triton_bmm_3465 0.0141 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.5861902Z triton_bmm_3470 0.0145 ms 94.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:28.5862879Z triton_bmm_3462 0.0153 ms 89.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:11:28.5863851Z triton_bmm_3461 0.0154 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=16, BLOCK_M=64, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=2, num_warps=4 2025-09-07T10:11:28.5864694Z SingleProcess AUTOTUNE benchmarking takes 0.2387 seconds and 0.0002 seconds precompiling for 19 choices 2025-09-07T10:11:28.7684260Z Autotune Choices Stats: 2025-09-07T10:11:28.7685733Z {"num_choices": 18, "num_triton_choices": 17, "best_kernel": "triton_bmm_3482", "best_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4", "best_time": 0.013919999822974205, "best_triton_pos": 0} 2025-09-07T10:11:28.8218894Z AUTOTUNE 
bmm(128x48x48, 128x48x784) 2025-09-07T10:11:28.8219193Z strides: [2304, 48, 1], [37632, 1, 48] 2025-09-07T10:11:28.8219481Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:28.8220153Z triton_bmm_3482 0.0139 ms 100.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:28.8221823Z triton_bmm_3485 0.0140 ms 99.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.8222817Z triton_bmm_3486 0.0142 ms 97.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:28.8224071Z triton_bmm_3490 0.0145 ms 96.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:28.8225358Z triton_bmm_3483 0.0147 ms 95.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.8226335Z triton_bmm_3484 0.0147 ms 94.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:28.8227253Z triton_bmm_3487 0.0148 ms 94.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:28.8228174Z triton_bmm_3488 0.0152 ms 91.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=64, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:28.8229085Z triton_bmm_3481 0.0152 ms 91.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=64, 
EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=8 2025-09-07T10:11:28.8230005Z triton_bmm_3478 0.0157 ms 88.4% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=32, EVEN_K=False, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=4 2025-09-07T10:11:28.8230787Z SingleProcess AUTOTUNE benchmarking takes 0.2357 seconds and 0.0003 seconds precompiling for 18 choices 2025-09-07T10:11:29.1965904Z Autotune Choices Stats: 2025-09-07T10:11:29.1967211Z {"num_choices": 20, "num_triton_choices": 19, "best_kernel": "mm", "best_time": 0.038176000118255615, "best_triton_pos": 1, "best_triton_time": 0.04899200052022934, "best_triton_kernel": "triton_mm_3507", "best_triton_kernel_desc": "ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4"} 2025-09-07T10:11:29.2876132Z AUTOTUNE mm(6272x2304, 2304x768) 2025-09-07T10:11:29.2876482Z strides: [2304, 1], [768, 1] 2025-09-07T10:11:29.2876763Z dtypes: torch.float16, torch.float16 2025-09-07T10:11:29.2877055Z mm 0.0382 ms 100.0% 2025-09-07T10:11:29.2877689Z triton_mm_3507 0.0490 ms 77.9% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:29.2878736Z triton_mm_3500 0.0510 ms 74.8% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:29.2879759Z triton_mm_3502 0.0518 ms 73.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:29.2880757Z triton_mm_3508 0.0523 ms 73.1% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:29.2881737Z 
triton_mm_3509 0.0545 ms 70.0% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=64, BLOCK_M=128, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=5, num_warps=8 2025-09-07T10:11:29.2882712Z triton_mm_3503 0.0576 ms 66.2% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=128, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=4 2025-09-07T10:11:29.2884155Z triton_mm_3501 0.0613 ms 62.3% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=64, BLOCK_N=128, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:29.2885800Z triton_mm_3504 0.0674 ms 56.7% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=3, num_warps=4 2025-09-07T10:11:29.2887038Z triton_mm_3505 0.0712 ms 53.6% ACC_TYPE='tl.float32', ALLOW_TF32=False, BLOCK_K=32, BLOCK_M=128, BLOCK_N=64, EVEN_K=True, GROUP_M=8, USE_FAST_ACCUM=False, num_stages=4, num_warps=8 2025-09-07T10:11:29.2887891Z SingleProcess AUTOTUNE benchmarking takes 0.4652 seconds and 0.0002 seconds precompiling for 20 choices 2025-09-07T10:11:54.9640711Z W0907 10:11:54.963000 167463 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:13:50.1949383Z pass_due_to_skip 2025-09-07T10:14:00.9845747Z accuracy pass_rate=92.31% 2025-09-07T10:14:00.9851699Z calls_captured gmean=1265.66x mean=1732.077x 2025-09-07T10:14:00.9855328Z unique_graphs gmean=2.73x mean=2.769x 2025-09-07T10:14:00.9859789Z graph_breaks gmean=6.76x mean=6.769x 2025-09-07T10:14:00.9863052Z unique_graph_breaks gmean=5.00x mean=5.000x 2025-09-07T10:14:00.9866797Z autograd_captures gmean=0.00x mean=0.000x 2025-09-07T10:14:00.9870357Z autograd_compiles gmean=0.00x mean=0.000x 2025-09-07T10:14:00.9873759Z cudagraph_skips gmean=0.00x mean=0.231x 2025-09-07T10:14:00.9874922Z compilation_latency mean=125.198 seconds 2025-09-07T10:14:02.1194165Z + [[ 
training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *cudagraphs_low_precision-true* ]] 2025-09-07T10:14:02.1195703Z + [[ training == \i\n\f\e\r\e\n\c\e ]] 2025-09-07T10:14:02.1196002Z + for target in "${targets[@]}" 2025-09-07T10:14:02.1196250Z + target_flag=('--performance') 2025-09-07T10:14:02.1196496Z + local target_flag 2025-09-07T10:14:02.1196738Z + [[ performance == \p\e\r\f\o\r\m\a\n\c\e ]] 2025-09-07T10:14:02.1197037Z + target_flag+=(--cold-start-latency) 2025-09-07T10:14:02.1198182Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *freezing-true* ]] 2025-09-07T10:14:02.1200126Z + [[ training-true-inference-true-default-true-dynamic-true-cudagraphs-true-cppwrapper-true-aotinductor-true-freezing_cudagraphs-true-maxautotune-true-freeze_autotune_cudagraphs-true-cudagraphs_low_precision-true == *default-true* ]] 2025-09-07T10:14:02.1202108Z + python benchmarks/dynamo/timm_models.py --performance --cold-start-latency --training --amp --backend inductor --disable-cudagraphs --device cuda --total-partitions 7 --partition-id 6 --output /var/lib/jenkins/workspace/test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance.csv 2025-09-07T10:14:03.1526330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 
2025-09-07T10:14:03.1527738Z import pynvml # type: ignore[import] 2025-09-07T10:14:07.9131030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you. 2025-09-07T10:14:07.9132246Z import pynvml # type: ignore[import] 2025-09-07T10:14:10.9282563Z 2025-09-07T10:14:12.3530379Z loading model: 0it [00:00, ?it/s] 2025-09-07T10:14:12.3531329Z loading model: 0it [00:01, ?it/s] 2025-09-07T10:14:12.3531660Z cuda train selecsls42b 2025-09-07T10:14:44.0470371Z W0907 10:14:44.045000 176690 site-packages/torch/_logging/_internal.py:1199] [6/0] Profiler function will be ignored 2025-09-07T10:15:02.5892661Z 2025-09-07T10:15:02.7463884Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:16:22.6691464Z 2025-09-07T10:16:22.8369758Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:18:52.4693293Z 2025-09-07T10:18:52.6439383Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:23:18.0008328Z 2025-09-07T10:23:18.2395527Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:26:33.4656089Z 2025-09-07T10:26:33.6869512Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:29:24.5468924Z 2025-09-07T10:29:24.7404458Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:30:33.4897823Z 2025-09-07T10:30:33.6411876Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:31:30.4232719Z 2025-09-07T10:31:30.5318814Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:33:18.8989549Z 2025-09-07T10:33:19.0260045Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:37:01.7750679Z 2025-09-07T10:37:02.0307589Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:38:23.3989943Z 2025-09-07T10:38:23.5386230Z running benchmark: 0% 0/30 [00:00 will be ignored 
2025-09-07T10:39:47.3816600Z 2025-09-07T10:39:47.5148720Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:42:25.2835381Z 2025-09-07T10:42:25.5837394Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:47:04.9354199Z 2025-09-07T10:47:05.3228764Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:50:26.7543981Z 2025-09-07T10:50:27.2488679Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:53:23.2393790Z 2025-09-07T10:53:23.5591241Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:54:31.2101725Z 2025-09-07T10:54:31.4397491Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:55:28.7507819Z 2025-09-07T10:55:28.9086668Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T10:57:14.2000498Z 2025-09-07T10:57:14.4111397Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:01:05.2854389Z 2025-09-07T11:01:05.7837520Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:02:33.1971669Z 2025-09-07T11:02:33.5028948Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:03:58.7708323Z 2025-09-07T11:03:59.0476037Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:06:40.6277172Z 2025-09-07T11:06:41.0368427Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:11:19.6361078Z 2025-09-07T11:11:20.2385525Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:14:42.7456498Z 2025-09-07T11:14:43.1811371Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:17:48.7370827Z 2025-09-07T11:17:49.1909967Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:18:57.8560932Z 2025-09-07T11:18:58.1690622Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:19:56.4586650Z 2025-09-07T11:19:56.6379451Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:21:44.0713726Z 2025-09-07T11:21:44.3717413Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:25:37.4452569Z 
2025-09-07T11:25:38.1708349Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:27:31.7975136Z 2025-09-07T11:27:31.9041186Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:29:47.8647905Z 2025-09-07T11:29:47.9663810Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:34:01.8154237Z 2025-09-07T11:34:02.0162687Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:41:31.7111459Z 2025-09-07T11:41:31.9527667Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:47:29.0615855Z 2025-09-07T11:47:29.2878976Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:52:47.8474558Z 2025-09-07T11:52:48.0447666Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:54:25.9400336Z 2025-09-07T11:54:26.0937310Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:55:49.2373400Z 2025-09-07T11:55:49.3533626Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T11:58:45.5947461Z 2025-09-07T11:58:45.7288643Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:05:52.2723092Z 2025-09-07T12:05:52.5404883Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:07:33.2229543Z 2025-09-07T12:07:33.3607630Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:09:19.8952893Z 2025-09-07T12:09:20.0298632Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:13:41.0349397Z 2025-09-07T12:13:41.3301110Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:19:32.6005310Z 2025-09-07T12:19:32.9938625Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:24:35.4343316Z 2025-09-07T12:24:35.7904699Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:29:18.1796755Z 2025-09-07T12:29:18.5096675Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:30:50.2796332Z 2025-09-07T12:30:50.5060406Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:32:09.6487753Z 2025-09-07T12:32:09.8063806Z running benchmark: 0% 0/30 
[00:00 will be ignored 2025-09-07T12:34:57.4617332Z 2025-09-07T12:34:57.6688660Z running benchmark: 0% 0/30 [00:00 will be ignored 2025-09-07T12:39:47.1698486Z 2025-09-07T12:39:47.6415477Z running benchmark: 0% 0/30 [00:00> $GITHUB_ENV 2025-09-07T14:46:43.8657543Z echo "DEVICE_TYPE=$DEVICE_TYPE" >> $GITHUB_ENV 2025-09-07T14:46:43.8671907Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:46:43.8672186Z env: 2025-09-07T14:46:43.8672351Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:46:43.8672606Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:46:43.8672955Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:46:43.8673363Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:46:43.8673699Z ##[endgroup] 2025-09-07T14:46:43.8705733Z + [[ -n '' ]] 2025-09-07T14:46:43.8706016Z + python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-09-07T14:46:44.1463346Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T14:46:44.8568405Z Collecting boto3==1.35.33 2025-09-07T14:46:44.9259481Z Downloading boto3-1.35.33-py3-none-any.whl (139 kB) 2025-09-07T14:46:44.9584483Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.1/139.1 KB 4.3 MB/s eta 0:00:00 2025-09-07T14:46:45.1021788Z Collecting psutil==7.0.0 2025-09-07T14:46:45.1130463Z Downloading psutil-7.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (277 kB) 2025-09-07T14:46:45.1408388Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 278.0/278.0 KB 10.4 MB/s eta 0:00:00 2025-09-07T14:46:45.1841444Z Collecting pynvml==12.0.0 2025-09-07T14:46:45.1967867Z Downloading pynvml-12.0.0-py3-none-any.whl (26 kB) 2025-09-07T14:46:45.2258901Z Collecting jmespath<2.0.0,>=0.7.1 2025-09-07T14:46:45.2367371Z Downloading jmespath-1.0.1-py3-none-any.whl (20 kB) 2025-09-07T14:46:45.2730704Z Collecting s3transfer<0.11.0,>=0.10.0 
2025-09-07T14:46:45.2836753Z Downloading s3transfer-0.10.4-py3-none-any.whl (83 kB) 2025-09-07T14:46:45.2910235Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 83.2/83.2 KB 13.0 MB/s eta 0:00:00 2025-09-07T14:46:45.9968925Z Collecting botocore<1.36.0,>=1.35.33 2025-09-07T14:46:46.0081097Z Downloading botocore-1.35.99-py3-none-any.whl (13.3 MB) 2025-09-07T14:46:46.1280622Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.3/13.3 MB 113.9 MB/s eta 0:00:00 2025-09-07T14:46:46.1931630Z Collecting nvidia-ml-py<13.0.0a0,>=12.0.0 2025-09-07T14:46:46.2050071Z Downloading nvidia_ml_py-12.575.51-py3-none-any.whl (47 kB) 2025-09-07T14:46:46.2143038Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.5/47.5 KB 5.0 MB/s eta 0:00:00 2025-09-07T14:46:46.2208193Z Requirement already satisfied: urllib3!=2.2.0,<3,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.26.5) 2025-09-07T14:46:46.2214716Z Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/lib/python3/dist-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (2.8.1) 2025-09-07T14:46:46.4583980Z Installing collected packages: nvidia-ml-py, pynvml, psutil, jmespath, botocore, s3transfer, boto3 2025-09-07T14:46:46.4584835Z Attempting uninstall: nvidia-ml-py 2025-09-07T14:46:46.4589969Z Found existing installation: nvidia-ml-py 11.525.84 2025-09-07T14:46:46.4627304Z Uninstalling nvidia-ml-py-11.525.84: 2025-09-07T14:46:46.4654213Z Successfully uninstalled nvidia-ml-py-11.525.84 2025-09-07T14:46:46.5402892Z Attempting uninstall: psutil 2025-09-07T14:46:46.5409082Z Found existing installation: psutil 5.9.8 2025-09-07T14:46:46.5575717Z Uninstalling psutil-5.9.8: 2025-09-07T14:46:46.5584869Z Successfully uninstalled psutil-5.9.8 2025-09-07T14:46:47.2855173Z Successfully installed boto3-1.35.33 botocore-1.35.99 jmespath-1.0.1 nvidia-ml-py-12.575.51 psutil-7.0.0 pynvml-12.0.0 s3transfer-0.10.4 2025-09-07T14:46:47.3853191Z + DEVICE_NAME= 2025-09-07T14:46:47.3853569Z + DEVICE_TYPE= 
2025-09-07T14:46:47.3853916Z + command -v nvidia-smi 2025-09-07T14:46:47.3854669Z + python3 -mpip install torch==2.7.1 2025-09-07T14:46:47.3855421Z /usr/bin/nvidia-smi 2025-09-07T14:46:47.6629560Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T14:46:47.8808548Z Collecting torch==2.7.1 2025-09-07T14:46:47.9383367Z Downloading torch-2.7.1-cp310-cp310-manylinux_2_28_x86_64.whl (821.2 MB) 2025-09-07T14:47:00.0378417Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 821.2/821.2 MB 1.3 MB/s eta 0:00:00 2025-09-07T14:47:00.9380867Z Collecting nvidia-cublas-cu12==12.6.4.1 2025-09-07T14:47:00.9517051Z Downloading nvidia_cublas_cu12-12.6.4.1-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (393.1 MB) 2025-09-07T14:47:05.9743980Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 393.1/393.1 MB 3.1 MB/s eta 0:00:00 2025-09-07T14:47:06.3738879Z Collecting nvidia-cuda-cupti-cu12==12.6.80 2025-09-07T14:47:06.3865936Z Downloading nvidia_cuda_cupti_cu12-12.6.80-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (8.9 MB) 2025-09-07T14:47:06.4583267Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.9/8.9 MB 128.0 MB/s eta 0:00:00 2025-09-07T14:47:06.4888381Z Collecting nvidia-cufile-cu12==1.11.1.6 2025-09-07T14:47:06.5017494Z Downloading nvidia_cufile_cu12-1.11.1.6-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.1 MB) 2025-09-07T14:47:06.5181075Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 78.1 MB/s eta 0:00:00 2025-09-07T14:47:06.5458658Z Collecting nvidia-nvtx-cu12==12.6.77 2025-09-07T14:47:06.5565390Z Downloading nvidia_nvtx_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB) 2025-09-07T14:47:06.5648476Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.3/89.3 KB 11.6 MB/s eta 0:00:00 2025-09-07T14:47:06.5956934Z Collecting jinja2 2025-09-07T14:47:06.6066081Z Downloading jinja2-3.1.6-py3-none-any.whl (134 kB) 2025-09-07T14:47:06.6154349Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.9/134.9 KB 17.7 MB/s eta 0:00:00 
2025-09-07T14:47:06.6409707Z Collecting nvidia-cusolver-cu12==11.7.1.2 2025-09-07T14:47:06.6554752Z Downloading nvidia_cusolver_cu12-11.7.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (158.2 MB) 2025-09-07T14:47:08.1435963Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 158.2/158.2 MB 13.0 MB/s eta 0:00:00 2025-09-07T14:47:08.3166302Z Collecting nvidia-cusparselt-cu12==0.6.3 2025-09-07T14:47:08.3308541Z Downloading nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl (156.8 MB) 2025-09-07T14:47:09.7689570Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 156.8/156.8 MB 13.5 MB/s eta 0:00:00 2025-09-07T14:47:09.9424888Z Collecting nvidia-cusparse-cu12==12.5.4.2 2025-09-07T14:47:09.9560921Z Downloading nvidia_cusparse_cu12-12.5.4.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (216.6 MB) 2025-09-07T14:47:12.3023056Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 216.6/216.6 MB 7.3 MB/s eta 0:00:00 2025-09-07T14:47:12.5317990Z Collecting nvidia-nvjitlink-cu12==12.6.85 2025-09-07T14:47:12.5451581Z Downloading nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB) 2025-09-07T14:47:12.6866135Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.7/19.7 MB 103.5 MB/s eta 0:00:00 2025-09-07T14:47:12.7324123Z Collecting nvidia-cuda-runtime-cu12==12.6.77 2025-09-07T14:47:12.7438838Z Downloading nvidia_cuda_runtime_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (897 kB) 2025-09-07T14:47:12.7586309Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.7/897.7 KB 69.9 MB/s eta 0:00:00 2025-09-07T14:47:12.7851526Z Collecting nvidia-cuda-nvrtc-cu12==12.6.77 2025-09-07T14:47:12.7993448Z Downloading nvidia_cuda_nvrtc_cu12-12.6.77-py3-none-manylinux2014_x86_64.whl (23.7 MB) 2025-09-07T14:47:12.9666014Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 94.0 MB/s eta 0:00:00 2025-09-07T14:47:13.0189061Z Collecting triton==3.3.1 2025-09-07T14:47:13.0313406Z Downloading 
triton-3.3.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (155.6 MB) 2025-09-07T14:47:14.4351138Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 155.6/155.6 MB 13.6 MB/s eta 0:00:00 2025-09-07T14:47:14.6305993Z Collecting fsspec 2025-09-07T14:47:14.6409881Z Downloading fsspec-2025.9.0-py3-none-any.whl (199 kB) 2025-09-07T14:47:14.6501298Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 199.3/199.3 KB 26.2 MB/s eta 0:00:00 2025-09-07T14:47:14.6746019Z Collecting nvidia-curand-cu12==10.3.7.77 2025-09-07T14:47:14.6886306Z Downloading nvidia_curand_cu12-10.3.7.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (56.3 MB) 2025-09-07T14:47:15.1600002Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.3/56.3 MB 40.4 MB/s eta 0:00:00 2025-09-07T14:47:15.2372351Z Collecting nvidia-nccl-cu12==2.26.2 2025-09-07T14:47:15.2513040Z Downloading nvidia_nccl_cu12-2.26.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (201.3 MB) 2025-09-07T14:47:17.3467418Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 201.3/201.3 MB 8.3 MB/s eta 0:00:00 2025-09-07T14:47:17.5386365Z Requirement already satisfied: typing-extensions>=4.10.0 in /home/david/.local/lib/python3.10/site-packages (from torch==2.7.1) (4.15.0) 2025-09-07T14:47:17.5800374Z Collecting networkx 2025-09-07T14:47:17.5904366Z Downloading networkx-3.4.2-py3-none-any.whl (1.7 MB) 2025-09-07T14:47:17.6090843Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 101.4 MB/s eta 0:00:00 2025-09-07T14:47:17.6481912Z Collecting filelock 2025-09-07T14:47:17.6583435Z Downloading filelock-3.19.1-py3-none-any.whl (15 kB) 2025-09-07T14:47:17.6858656Z Collecting nvidia-cudnn-cu12==9.5.1.17 2025-09-07T14:47:17.6966895Z Downloading nvidia_cudnn_cu12-9.5.1.17-py3-none-manylinux_2_28_x86_64.whl (571.0 MB) 2025-09-07T14:47:29.7495680Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 571.0/571.0 MB 1.1 MB/s eta 0:00:00 2025-09-07T14:47:30.3112829Z Collecting nvidia-cufft-cu12==11.3.0.4 2025-09-07T14:47:30.3231550Z Downloading 
nvidia_cufft_cu12-11.3.0.4-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (200.2 MB) 2025-09-07T14:47:32.3868541Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 200.2/200.2 MB 8.3 MB/s eta 0:00:00 2025-09-07T14:47:32.6094334Z Collecting sympy>=1.13.3 2025-09-07T14:47:32.6197155Z Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB) 2025-09-07T14:47:32.6686208Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 134.5 MB/s eta 0:00:00 2025-09-07T14:47:32.7075361Z Requirement already satisfied: setuptools>=40.8.0 in /usr/lib/python3/dist-packages (from triton==3.3.1->torch==2.7.1) (59.6.0) 2025-09-07T14:47:32.7348365Z Collecting mpmath<1.4,>=1.1.0 2025-09-07T14:47:32.7457798Z Downloading mpmath-1.3.0-py3-none-any.whl (536 kB) 2025-09-07T14:47:32.7581121Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 KB 51.1 MB/s eta 0:00:00 2025-09-07T14:47:32.9380238Z Collecting MarkupSafe>=2.0 2025-09-07T14:47:32.9486405Z Downloading MarkupSafe-3.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (20 kB) 2025-09-07T14:47:33.2683449Z Installing collected packages: nvidia-cusparselt-cu12, mpmath, triton, sympy, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, networkx, MarkupSafe, fsspec, filelock, nvidia-cusparse-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, torch 2025-09-07T14:47:38.3702954Z WARNING: The scripts proton and proton-viewer are installed in '/home/david/.local/bin' which is not on PATH. 2025-09-07T14:47:38.3704293Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T14:47:41.8864525Z WARNING: The script isympy is installed in '/home/david/.local/bin' which is not on PATH. 2025-09-07T14:47:41.8866161Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 
2025-09-07T14:48:17.6968583Z WARNING: The scripts torchfrtrace and torchrun are installed in '/home/david/.local/bin' which is not on PATH. 2025-09-07T14:48:17.6969457Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T14:48:18.2014607Z Successfully installed MarkupSafe-3.0.2 filelock-3.19.1 fsspec-2025.9.0 jinja2-3.1.6 mpmath-1.3.0 networkx-3.4.2 nvidia-cublas-cu12-12.6.4.1 nvidia-cuda-cupti-cu12-12.6.80 nvidia-cuda-nvrtc-cu12-12.6.77 nvidia-cuda-runtime-cu12-12.6.77 nvidia-cudnn-cu12-9.5.1.17 nvidia-cufft-cu12-11.3.0.4 nvidia-cufile-cu12-1.11.1.6 nvidia-curand-cu12-10.3.7.77 nvidia-cusolver-cu12-11.7.1.2 nvidia-cusparse-cu12-12.5.4.2 nvidia-cusparselt-cu12-0.6.3 nvidia-nccl-cu12-2.26.2 nvidia-nvjitlink-cu12-12.6.85 nvidia-nvtx-cu12-12.6.77 sympy-1.14.0 torch-2.7.1 triton-3.3.1 2025-09-07T14:48:18.9007947Z + echo DEVICE_NAME= 2025-09-07T14:48:18.9008480Z + echo DEVICE_TYPE= 2025-09-07T14:48:18.9417408Z ##[group]Run set -eux 2025-09-07T14:48:18.9417620Z set -eux 2025-09-07T14:48:18.9417928Z  2025-09-07T14:48:18.9418223Z if [[ -z "${GITHUB_TOKEN}" ]]; then 2025-09-07T14:48:18.9418497Z  echo "Missing github-token input" 2025-09-07T14:48:18.9418738Z  exit 1 2025-09-07T14:48:18.9418898Z fi 2025-09-07T14:48:18.9433555Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:18.9433901Z env: 2025-09-07T14:48:18.9434097Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:18.9434398Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:18.9434792Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:18.9435379Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:18.9435733Z DEVICE_NAME: 2025-09-07T14:48:18.9435897Z DEVICE_TYPE: 2025-09-07T14:48:18.9436259Z GITHUB_TOKEN: *** 2025-09-07T14:48:18.9436434Z ##[endgroup] 2025-09-07T14:48:18.9909409Z + [[ -z *** ]] 2025-09-07T14:48:19.0347071Z 
##[group]Run pytorch/test-infra/.github/actions/get-workflow-job-id@main 2025-09-07T14:48:19.0347416Z with: 2025-09-07T14:48:19.0347808Z github-token: *** 2025-09-07T14:48:19.0347988Z env: 2025-09-07T14:48:19.0348149Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:19.0348401Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:19.0348740Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:19.0349143Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:19.0349486Z DEVICE_NAME: 2025-09-07T14:48:19.0349655Z DEVICE_TYPE: 2025-09-07T14:48:19.0349816Z ##[endgroup] 2025-09-07T14:48:19.1281211Z ##[group]Run set -eux 2025-09-07T14:48:19.1281449Z set -eux 2025-09-07T14:48:19.1281634Z  2025-09-07T14:48:19.1282017Z python3 "${GITHUB_ACTION_PATH}/../../scripts/get_workflow_job_id.py" "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-09-07T14:48:19.1296851Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:19.1297152Z env: 2025-09-07T14:48:19.1297319Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:19.1297570Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:19.1297897Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:19.1298310Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:19.1298650Z DEVICE_NAME: 2025-09-07T14:48:19.1298816Z DEVICE_TYPE: 2025-09-07T14:48:19.1299097Z GITHUB_TOKEN: *** 2025-09-07T14:48:19.1299268Z ##[endgroup] 2025-09-07T14:48:19.1766686Z + python3 /home/david/_work/_actions/pytorch/test-infra/main/.github/actions/get-workflow-job-id/../../scripts/get_workflow_job_id.py 17525296438 i-0d73070610f53945f-1004 2025-09-07T14:48:20.2247446Z setting job-id=49775781836 2025-09-07T14:48:20.2247910Z setting job-name=test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T14:48:20.2606492Z ##[group]Run set -eux 
2025-09-07T14:48:20.2606703Z set -eux 2025-09-07T14:48:20.2606890Z  2025-09-07T14:48:20.2607057Z if [[ -n "" ]]; then 2025-09-07T14:48:20.2607256Z  source "" 2025-09-07T14:48:20.2607435Z fi 2025-09-07T14:48:20.2607604Z  2025-09-07T14:48:20.2607900Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_metadata.py" \ 2025-09-07T14:48:20.2608290Z  --schema-version "${SCHEMA_VERSION}" \ 2025-09-07T14:48:20.2608545Z  --repo "${REPO}" \ 2025-09-07T14:48:20.2608769Z  --head-branch "${HEAD_BRANCH}" \ 2025-09-07T14:48:20.2609020Z  --head-sha "${HEAD_SHA}" \ 2025-09-07T14:48:20.2609526Z  --workflow-id "${WORKFLOW_RUN_ID}" \ 2025-09-07T14:48:20.2609817Z  --run-attempt "${RUN_ATTEMPT}" \ 2025-09-07T14:48:20.2610074Z  --job-id "${JOB_ID}" \ 2025-09-07T14:48:20.2610318Z  --job-name "${JOB_NAME}" 2025-09-07T14:48:20.2623637Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:20.2624110Z env: 2025-09-07T14:48:20.2624265Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:20.2624516Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:20.2624846Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:20.2625435Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:20.2625778Z DEVICE_NAME: 2025-09-07T14:48:20.2625951Z DEVICE_TYPE: 2025-09-07T14:48:20.2626117Z SCHEMA_VERSION: v3 2025-09-07T14:48:20.2626304Z REPO: pytorch/pytorch 2025-09-07T14:48:20.2626491Z HEAD_BRANCH: refs/heads/main 2025-09-07T14:48:20.2626732Z HEAD_SHA: 93fb23d6fae7c4e82c4239a1033e522088742634 2025-09-07T14:48:20.2626984Z WORKFLOW_RUN_ID: 17525296438 2025-09-07T14:48:20.2627171Z RUN_ATTEMPT: 1 2025-09-07T14:48:20.2627328Z JOB_ID: 49775781836 2025-09-07T14:48:20.2627612Z JOB_NAME: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T14:48:20.2627934Z ##[endgroup] 2025-09-07T14:48:20.3097083Z + [[ -n '' ]] 2025-09-07T14:48:20.3098773Z + python3 
/home/david/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_metadata.py --schema-version v3 --repo pytorch/pytorch --head-branch refs/heads/main --head-sha 93fb23d6fae7c4e82c4239a1033e522088742634 --workflow-id 17525296438 --run-attempt 1 --job-id 49775781836 --job-name 'test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100)' 2025-09-07T14:48:20.3576567Z ##[group]Run set -eux 2025-09-07T14:48:20.3576772Z set -eux 2025-09-07T14:48:20.3576940Z  2025-09-07T14:48:20.3577102Z if [[ -n "" ]]; then 2025-09-07T14:48:20.3577311Z  source "" 2025-09-07T14:48:20.3577483Z fi 2025-09-07T14:48:20.3577639Z  2025-09-07T14:48:20.3577985Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_runners_info.py" 2025-09-07T14:48:20.3591771Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:20.3592079Z env: 2025-09-07T14:48:20.3592248Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:20.3592498Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:20.3592829Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:20.3593230Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:20.3593579Z DEVICE_NAME: 2025-09-07T14:48:20.3593745Z DEVICE_TYPE: 2025-09-07T14:48:20.3593915Z ##[endgroup] 2025-09-07T14:48:20.4058115Z + [[ -n '' ]] 2025-09-07T14:48:20.4058738Z + python3 /home/david/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_runners_info.py 2025-09-07T14:48:21.1625544Z /home/david/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.) 
2025-09-07T14:48:21.1626653Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-09-07T14:48:22.6117273Z ##[group]Run set -eux 2025-09-07T14:48:22.6117505Z set -eux 2025-09-07T14:48:22.6117686Z  2025-09-07T14:48:22.6117886Z # TODO (huydhn): Implement this part 2025-09-07T14:48:22.6118217Z echo "dependencies={}" >> "${GITHUB_OUTPUT}" 2025-09-07T14:48:22.6132657Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:22.6132955Z env: 2025-09-07T14:48:22.6133116Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:22.6133382Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:22.6133969Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:22.6134400Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:22.6134740Z DEVICE_NAME: 2025-09-07T14:48:22.6134904Z DEVICE_TYPE: 2025-09-07T14:48:22.6135214Z ##[endgroup] 2025-09-07T14:48:22.6607258Z + echo 'dependencies={}' 2025-09-07T14:48:22.7081184Z ##[group]Run set -eux 2025-09-07T14:48:22.7081414Z set -eux 2025-09-07T14:48:22.7081600Z  2025-09-07T14:48:22.7081772Z if [[ -n "" ]]; then 2025-09-07T14:48:22.7081996Z  source "" 2025-09-07T14:48:22.7082183Z fi 2025-09-07T14:48:22.7082357Z  2025-09-07T14:48:22.7082581Z if [[ ! 
-d "${BENCHMARK_RESULTS_DIR}" ]]; then 2025-09-07T14:48:22.7082946Z  echo "${BENCHMARK_RESULTS_DIR} does not exist, skipping" 2025-09-07T14:48:22.7083396Z  # We don't want the job to fail if the directory doesn't exist 2025-09-07T14:48:22.7083786Z  exit 0 2025-09-07T14:48:22.7084003Z fi 2025-09-07T14:48:22.7084204Z  2025-09-07T14:48:22.7084435Z if [[ "${DRY_RUN}" == "true" ]]; then 2025-09-07T14:48:22.7084885Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-09-07T14:48:22.7085504Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-09-07T14:48:22.7085834Z  --metadata "${BENCHMARK_METADATA}" \ 2025-09-07T14:48:22.7086107Z  --runners "${RUNNER_INFO}" \ 2025-09-07T14:48:22.7086375Z  --dependencies "${DEPENDENCIES}" \ 2025-09-07T14:48:22.7086629Z  --dry-run 2025-09-07T14:48:22.7086819Z else 2025-09-07T14:48:22.7087106Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-09-07T14:48:22.7087543Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-09-07T14:48:22.7087864Z  --metadata "${BENCHMARK_METADATA}" \ 2025-09-07T14:48:22.7088139Z  --runners "${RUNNER_INFO}" \ 2025-09-07T14:48:22.7088396Z  --dependencies "${DEPENDENCIES}" 2025-09-07T14:48:22.7088641Z fi 2025-09-07T14:48:22.7102807Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:22.7103105Z env: 2025-09-07T14:48:22.7103290Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:22.7103545Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:22.7103894Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:22.7104308Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:22.7104658Z DEVICE_NAME: 2025-09-07T14:48:22.7104819Z DEVICE_TYPE: 2025-09-07T14:48:22.7105176Z BENCHMARK_RESULTS_DIR: test/test-reports 2025-09-07T14:48:22.7105414Z DRY_RUN: false 2025-09-07T14:48:22.7106312Z BENCHMARK_METADATA: {"timestamp": 1757256500, 
"schema_version": "v3", "name": "test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "93fb23d6fae7c4e82c4239a1033e522088742634", "workflow_id": 17525296438, "run_attempt": 1, "job_id": 49775781836} 2025-09-07T14:48:22.7107582Z RUNNER_INFO: [{"cpu_info": "x86_64", "cpu_count": 192, "avail_mem_in_gb": 1999, "extra_info": {"hostname": "92d046649eb1"}, "name": "cuda", "type": "NVIDIA H100 80GB HBM3", "gpu_count": 1, "avail_gpu_mem_in_gb": 79}] 2025-09-07T14:48:22.7108150Z DEPENDENCIES: {} 2025-09-07T14:48:22.7108318Z ##[endgroup] 2025-09-07T14:48:22.7569196Z + [[ -n '' ]] 2025-09-07T14:48:22.7569425Z + [[ ! -d test/test-reports ]] 2025-09-07T14:48:22.7569657Z + [[ false == \t\r\u\e ]] 2025-09-07T14:48:22.7572023Z + python3 /home/david/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py --benchmark-results-dir test/test-reports --metadata '{"timestamp": 1757256500, "schema_version": "v3", "name": "test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "93fb23d6fae7c4e82c4239a1033e522088742634", "workflow_id": 17525296438, "run_attempt": 1, "job_id": 49775781836}' --runners '[{"cpu_info": "x86_64", "cpu_count": 192, "avail_mem_in_gb": 1999, "extra_info": {"hostname": "92d046649eb1"}, "name": "cuda", "type": "NVIDIA H100 80GB HBM3", "gpu_count": 1, "avail_gpu_mem_in_gb": 79}]' --dependencies '{}' 2025-09-07T14:48:22.8808614Z INFO:root:Upload test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_performance.json 2025-09-07T14:48:22.9192900Z INFO:botocore.credentials:Found credentials from IAM Role: gh-ci-github-action-runners-runner-role 
2025-09-07T14:48:23.1720072Z INFO:root:Upload test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:23.3149434Z INFO:root:Upload test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:23.4662122Z INFO:root:Upload test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:23.6914392Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:23.8659629Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance.json 2025-09-07T14:48:24.0176072Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_accuracy.json 2025-09-07T14:48:24.1383117Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.json to 
s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.json 2025-09-07T14:48:24.2700883Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_performance.json 2025-09-07T14:48:24.4480220Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:24.5817473Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance.json 2025-09-07T14:48:24.7402135Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:24.9095453Z INFO:root:Upload test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_dynamic_timm_models_amp_training_cuda_h100_accuracy.json 2025-09-07T14:48:25.0698493Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:25.2722324Z INFO:root:Upload 
test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance.json 2025-09-07T14:48:25.4575844Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:25.6391271Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_performance.json 2025-09-07T14:48:25.7920858Z INFO:root:Upload test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:25.9308000Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:26.0563154Z INFO:root:Upload test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_dynamic_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:26.2341659Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json to 
s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:26.4070635Z INFO:root:Upload test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:26.5993497Z INFO:root:Upload test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_max_autotune_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:26.7892063Z INFO:root:Upload test/test-reports/inductor_export_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_export_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:26.9369873Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:27.0884306Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:27.2930402Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_performance.json to 
s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_performance.json 2025-09-07T14:48:27.4336462Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:27.5909680Z INFO:root:Upload test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_max_autotune_timm_models_amp_training_cuda_h100_accuracy.json 2025-09-07T14:48:27.7359830Z INFO:root:Upload test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_max_autotune_timm_models_amp_training_cuda_h100_performance.json 2025-09-07T14:48:27.8755445Z INFO:root:Upload test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:27.9852194Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_performance.json 2025-09-07T14:48:28.2030833Z INFO:root:Upload test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_dynamic_timm_models_amp_training_cuda_h100_performance.json 2025-09-07T14:48:28.3771262Z INFO:root:Upload 
test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:28.5560918Z INFO:root:Upload test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_performance.json 2025-09-07T14:48:28.7292463Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_accuracy.json 2025-09-07T14:48:28.8642788Z INFO:root:Upload test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_performance.json 2025-09-07T14:48:29.0144046Z INFO:root:Upload test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_performance.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_performance.json 2025-09-07T14:48:29.1592104Z INFO:root:Upload test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.json to s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.json 2025-09-07T14:48:29.3074180Z INFO:root:Upload test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json to 
s3://ossci-benchmarks/v3/pytorch/pytorch/17525296438/49775781836/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json 2025-09-07T14:48:29.5436348Z ##[group]Run cat test/**/*_toprint.log || true 2025-09-07T14:48:29.5436674Z cat test/**/*_toprint.log || true 2025-09-07T14:48:29.5451139Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:29.5451433Z env: 2025-09-07T14:48:29.5451596Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:29.5451858Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:29.5452185Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:29.5452601Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:29.5452948Z DEVICE_NAME: 2025-09-07T14:48:29.5453122Z DEVICE_TYPE: 2025-09-07T14:48:29.5453286Z ##[endgroup] 2025-09-07T14:48:29.6014846Z cat: 'test/**/*_toprint.log': No such file or directory 2025-09-07T14:48:29.6849927Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2025-09-07T14:48:29.6850233Z kill "$MONITOR_SCRIPT_PID" 2025-09-07T14:48:29.6864856Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:29.6865307Z env: 2025-09-07T14:48:29.6865477Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:29.6865764Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:29.6866144Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:29.6866636Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:29.6867069Z DEVICE_NAME: 2025-09-07T14:48:29.6867269Z DEVICE_TYPE: 2025-09-07T14:48:29.6867478Z MONITOR_SCRIPT_PID: 9088 2025-09-07T14:48:29.6867704Z ##[endgroup] 2025-09-07T14:48:29.7428909Z Prepare all required actions 2025-09-07T14:48:29.7429305Z Getting action download info 2025-09-07T14:48:29.9194821Z Download action repository 'seemethere/upload-artifact-s3@v5' 
(SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-09-07T14:48:30.6715383Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-09-07T14:48:32.2165473Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-09-07T14:48:32.2165744Z with: 2025-09-07T14:48:32.2166034Z file-suffix: test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836 2025-09-07T14:48:32.2166364Z s3-bucket: gha-artifacts 2025-09-07T14:48:32.2166557Z env: 2025-09-07T14:48:32.2166714Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:32.2166965Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:32.2167298Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:32.2167736Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:32.2168082Z DEVICE_NAME: 2025-09-07T14:48:32.2168244Z DEVICE_TYPE: 2025-09-07T14:48:32.2168407Z ##[endgroup] 2025-09-07T14:48:32.3130218Z ##[group]Run # Remove any previous test jsons if they exist 2025-09-07T14:48:32.3130568Z # Remove any previous test jsons if they exist 2025-09-07T14:48:32.3130849Z rm -f test-jsons-*.zip 2025-09-07T14:48:32.3131154Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test/test-reports -i '*.json' 2025-09-07T14:48:32.3145743Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:32.3146029Z env: 2025-09-07T14:48:32.3146181Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:32.3146454Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:32.3146869Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:32.3147365Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:32.3147806Z DEVICE_NAME: 2025-09-07T14:48:32.3148020Z DEVICE_TYPE: 2025-09-07T14:48:32.3148363Z FILE_SUFFIX: test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836 2025-09-07T14:48:32.3148719Z ##[endgroup] 
2025-09-07T14:48:32.3684639Z adding: test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.3707032Z adding: test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.3729317Z adding: test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.3843339Z adding: test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.3955866Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.3988963Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.4011380Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.4034457Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.4067829Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.4090274Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.4123493Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.4218736Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.4241889Z adding: test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_accuracy.json (deflated 99%) 
2025-09-07T14:48:32.4344417Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.4377987Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.4488797Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.4522152Z adding: test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.4544404Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.4567673Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.4675625Z adding: test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.4768435Z adding: test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.4862919Z adding: test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.4994801Z adding: test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.5017488Z adding: test/test-reports/inductor_export_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.5110217Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.5218278Z adding: 
test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.5252055Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.5274661Z adding: test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.5297187Z adding: test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.5331140Z adding: test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.5353579Z adding: test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.5387303Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.5420348Z adding: test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.5508265Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.5541588Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.5564770Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.5598434Z adding: test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.5631726Z adding: test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_performance.json (deflated 99%) 2025-09-07T14:48:32.5653944Z adding: 
test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.json (deflated 99%) 2025-09-07T14:48:32.5726138Z adding: test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.json (deflated 99%) 2025-09-07T14:48:32.5971240Z ##[group]Run # Remove any previous test reports if they exist 2025-09-07T14:48:32.5971605Z # Remove any previous test reports if they exist 2025-09-07T14:48:32.5971906Z rm -f test-reports-*.zip 2025-09-07T14:48:32.5972250Z zip -r "test-reports-${FILE_SUFFIX}.zip" test/test-reports -i '*.xml' -i '*.csv' 2025-09-07T14:48:32.5986087Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:32.5986380Z env: 2025-09-07T14:48:32.5986571Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:32.5986877Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:32.5987277Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:32.5987788Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:32.5988207Z DEVICE_NAME: 2025-09-07T14:48:32.5988400Z DEVICE_TYPE: 2025-09-07T14:48:32.5988696Z FILE_SUFFIX: test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836 2025-09-07T14:48:32.5989024Z ##[endgroup] 2025-09-07T14:48:32.6495945Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_performance_compilation_metrics.csv (deflated 48%) 2025-09-07T14:48:32.6496918Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.csv (deflated 54%) 2025-09-07T14:48:32.6497758Z adding: test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 55%) 2025-09-07T14:48:32.6498674Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T14:48:32.6499623Z adding: 
test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_performance.csv (deflated 50%) 2025-09-07T14:48:32.6501369Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.csv (deflated 51%) 2025-09-07T14:48:32.6502818Z adding: test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_performance.csv (deflated 49%) 2025-09-07T14:48:32.6503718Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_performance.csv (deflated 50%) 2025-09-07T14:48:32.6504670Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_performance_compilation_metrics.csv (deflated 48%) 2025-09-07T14:48:32.6505743Z adding: test/test-reports/inductor_cudagraphs_low_precision_timm_models_quant_inference_cuda_h100_accuracy.csv (deflated 56%) 2025-09-07T14:48:32.6506601Z adding: test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 56%) 2025-09-07T14:48:32.6507504Z adding: test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_performance.csv (deflated 50%) 2025-09-07T14:48:32.6508350Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.csv (deflated 49%) 2025-09-07T14:48:32.6509312Z adding: test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.csv (deflated 49%) 2025-09-07T14:48:32.6510213Z adding: test/test-reports/inductor_with_cudagraphs_freezing_autotune_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 56%) 2025-09-07T14:48:32.6511084Z adding: test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 56%) 2025-09-07T14:48:32.6512389Z adding: 
test/test-reports/inductor_with_cudagraphs_freezing_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T14:48:32.6513264Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_performance.csv (deflated 49%) 2025-09-07T14:48:32.6514005Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_performance.csv (deflated 49%) 2025-09-07T14:48:32.6514784Z adding: test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 66%) 2025-09-07T14:48:32.6515700Z adding: test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_performance_compilation_metrics.csv (deflated 48%) 2025-09-07T14:48:32.6516527Z adding: test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_performance.csv (deflated 50%) 2025-09-07T14:48:32.6517260Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 56%) 2025-09-07T14:48:32.6517914Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance.csv (deflated 50%) 2025-09-07T14:48:32.6518583Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 56%) 2025-09-07T14:48:32.6519862Z adding: test/test-reports/inductor_max_autotune_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.csv (deflated 51%) 2025-09-07T14:48:32.6520564Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_amp_training_cuda_h100_accuracy.csv (deflated 54%) 2025-09-07T14:48:32.6521198Z adding: test/test-reports/inductor_cpp_wrapper_timm_models_bfloat16_inference_cuda_h100_performance.csv (deflated 50%) 2025-09-07T14:48:32.6523426Z adding: test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T14:48:32.6524126Z adding: 
test/test-reports/inductor_max_autotune_timm_models_amp_training_cuda_h100_accuracy.csv (deflated 54%) 2025-09-07T14:48:32.6524766Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 56%) 2025-09-07T14:48:32.6525522Z adding: test/test-reports/inductor_export_timm_models_bfloat16_inference_cuda_h100_accuracy.csv (deflated 57%) 2025-09-07T14:48:32.6526734Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T14:48:32.6527553Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance.csv (deflated 49%) 2025-09-07T14:48:32.6528231Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_bfloat16_inference_cuda_h100_performance.csv (deflated 50%) 2025-09-07T14:48:32.6528886Z adding: test/test-reports/inductor_with_cudagraphs_timm_models_amp_training_cuda_h100_accuracy.csv (deflated 54%) 2025-09-07T14:48:32.6530014Z adding: test/test-reports/inductor_no_cudagraphs_timm_models_amp_training_cuda_h100_performance_compilation_metrics.csv (deflated 48%) 2025-09-07T14:48:32.6530724Z adding: test/test-reports/inductor_aot_inductor_timm_models_bfloat16_inference_cuda_h100_performance.csv (deflated 52%) 2025-09-07T14:48:32.6531349Z adding: test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_accuracy.csv (deflated 54%) 2025-09-07T14:48:32.6533042Z adding: test/test-reports/inductor_dynamic_timm_models_bfloat16_inference_cuda_h100_performance_compilation_metrics.csv (deflated 50%) 2025-09-07T14:48:32.6533778Z adding: test/test-reports/inductor_cudagraphs_low_precision_timm_models_quant_inference_cuda_h100_performance.csv (deflated 50%) 2025-09-07T14:48:32.6534448Z adding: test/test-reports/inductor_dynamic_timm_models_amp_training_cuda_h100_performance.csv (deflated 49%) 2025-09-07T14:48:32.6911323Z ##[group]Run # Remove any previous usage logs if they exist 
2025-09-07T14:48:32.6911700Z # Remove any previous usage logs if they exist 2025-09-07T14:48:32.6912216Z rm -f logs-*.zip 2025-09-07T14:48:32.6912513Z zip "logs-${FILE_SUFFIX}.zip" 'usage_log.txt' || true 2025-09-07T14:48:32.6912923Z zip -r "logs-${FILE_SUFFIX}.zip" test/test-reports -i '*.log' || true 2025-09-07T14:48:32.6926452Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:32.6926768Z env: 2025-09-07T14:48:32.6926940Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:32.6927187Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:32.6927526Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:32.6927935Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:32.6928278Z DEVICE_NAME: 2025-09-07T14:48:32.6928443Z DEVICE_TYPE: 2025-09-07T14:48:32.6928713Z FILE_SUFFIX: test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836 2025-09-07T14:48:32.6929029Z ##[endgroup] 2025-09-07T14:48:32.7601729Z adding: usage_log.txt (deflated 91%) 2025-09-07T14:48:32.7627886Z 2025-09-07T14:48:32.7628174Z zip error: Nothing to do! 
(logs-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip) 2025-09-07T14:48:32.8302622Z ##[group]Run # Remove any previous debugging artifacts if they exist 2025-09-07T14:48:32.8303012Z # Remove any previous debugging artifacts if they exist 2025-09-07T14:48:32.8303314Z rm -f debug-*.zip 2025-09-07T14:48:32.8303546Z if [ -d 'test/debug' ]; then 2025-09-07T14:48:32.8303821Z  zip -r "debug-${FILE_SUFFIX}.zip" test/debug 2025-09-07T14:48:32.8304086Z fi 2025-09-07T14:48:32.8317594Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:32.8317925Z env: 2025-09-07T14:48:32.8318089Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:32.8318340Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:32.8318682Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:32.8319110Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:32.8319453Z DEVICE_NAME: 2025-09-07T14:48:32.8319623Z DEVICE_TYPE: 2025-09-07T14:48:32.8319903Z FILE_SUFFIX: test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836 2025-09-07T14:48:32.8320217Z ##[endgroup] 2025-09-07T14:48:32.9752623Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-09-07T14:48:32.9753112Z with: 2025-09-07T14:48:32.9753297Z s3-bucket: gha-artifacts 2025-09-07T14:48:32.9753560Z s3-prefix: pytorch/pytorch/17525296438/1/artifact 2025-09-07T14:48:32.9753849Z retention-days: 14 2025-09-07T14:48:32.9754063Z if-no-files-found: warn 2025-09-07T14:48:32.9754284Z path: test-jsons-*.zip 2025-09-07T14:48:32.9754490Z name: artifact 2025-09-07T14:48:32.9754678Z region: us-east-1 2025-09-07T14:48:32.9754863Z env: 2025-09-07T14:48:32.9755178Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:32.9755449Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:32.9755838Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:32.9756316Z DOCKER_CONTAINER_ID: 
146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:32.9756713Z DEVICE_NAME: 2025-09-07T14:48:32.9756890Z DEVICE_TYPE: 2025-09-07T14:48:32.9757073Z ##[endgroup] 2025-09-07T14:48:33.2911244Z NOTE: s3-prefix specified, ignoring name parameter 2025-09-07T14:48:33.2911673Z With the provided path, there will be 1 file uploaded 2025-09-07T14:48:33.2912063Z Uploading to s3 prefix: pytorch/pytorch/17525296438/1/artifact 2025-09-07T14:48:33.2920425Z Starting upload of test-jsons-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip 2025-09-07T14:48:33.6625870Z Finished upload of test-jsons-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip 2025-09-07T14:48:33.6941045Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-09-07T14:48:33.6941300Z with: 2025-09-07T14:48:33.6941565Z s3-bucket: gha-artifacts 2025-09-07T14:48:33.6942073Z s3-prefix: pytorch/pytorch/17525296438/1/artifact 2025-09-07T14:48:33.6942343Z retention-days: 14 2025-09-07T14:48:33.6942520Z if-no-files-found: error 2025-09-07T14:48:33.6942726Z path: test-reports-*.zip 2025-09-07T14:48:33.6942922Z name: artifact 2025-09-07T14:48:33.6943106Z region: us-east-1 2025-09-07T14:48:33.6943275Z env: 2025-09-07T14:48:33.6943444Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:33.6943704Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:33.6944042Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:33.6944460Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:33.6944811Z DEVICE_NAME: 2025-09-07T14:48:33.6945136Z DEVICE_TYPE: 2025-09-07T14:48:33.6945306Z ##[endgroup] 2025-09-07T14:48:34.0012356Z NOTE: s3-prefix specified, ignoring name parameter 2025-09-07T14:48:34.0012726Z With the provided path, there will be 1 file uploaded 2025-09-07T14:48:34.0013115Z Uploading to s3 prefix: pytorch/pytorch/17525296438/1/artifact 2025-09-07T14:48:34.0021046Z Starting upload of 
test-reports-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip 2025-09-07T14:48:34.1994596Z Finished upload of test-reports-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip 2025-09-07T14:48:34.2705949Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-09-07T14:48:34.2706245Z with: 2025-09-07T14:48:34.2706417Z s3-bucket: gha-artifacts 2025-09-07T14:48:34.2706670Z s3-prefix: pytorch/pytorch/17525296438/1/artifact 2025-09-07T14:48:34.2706927Z retention-days: 14 2025-09-07T14:48:34.2707144Z if-no-files-found: ignore 2025-09-07T14:48:34.2707370Z path: logs-*.zip 2025-09-07T14:48:34.2707560Z name: artifact 2025-09-07T14:48:34.2707742Z region: us-east-1 2025-09-07T14:48:34.2707934Z env: 2025-09-07T14:48:34.2708111Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:34.2708392Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:34.2708775Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:34.2709242Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:34.2709635Z DEVICE_NAME: 2025-09-07T14:48:34.2709818Z DEVICE_TYPE: 2025-09-07T14:48:34.2709995Z ##[endgroup] 2025-09-07T14:48:34.5793878Z NOTE: s3-prefix specified, ignoring name parameter 2025-09-07T14:48:34.5794587Z With the provided path, there will be 1 file uploaded 2025-09-07T14:48:34.5795132Z Uploading to s3 prefix: pytorch/pytorch/17525296438/1/artifact 2025-09-07T14:48:34.5803374Z Starting upload of logs-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip 2025-09-07T14:48:34.7771011Z Finished upload of logs-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip 2025-09-07T14:48:34.8386304Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-09-07T14:48:34.8386587Z with: 2025-09-07T14:48:34.8386774Z s3-bucket: gha-artifacts 2025-09-07T14:48:34.8387066Z s3-prefix: pytorch/pytorch/17525296438/1/artifact 2025-09-07T14:48:34.8387358Z retention-days: 14 
2025-09-07T14:48:34.8387568Z if-no-files-found: ignore 2025-09-07T14:48:34.8387798Z path: debug-*.zip 2025-09-07T14:48:34.8387988Z name: artifact 2025-09-07T14:48:34.8388170Z region: us-east-1 2025-09-07T14:48:34.8388351Z env: 2025-09-07T14:48:34.8388520Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:34.8388802Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:34.8389166Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:34.8389629Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:34.8390018Z DEVICE_NAME: 2025-09-07T14:48:34.8390198Z DEVICE_TYPE: 2025-09-07T14:48:34.8390373Z ##[endgroup] 2025-09-07T14:48:35.1450445Z No files were found with the provided path: debug-*.zip. No artifacts will be uploaded. 2025-09-07T14:48:35.1742509Z ##[group]Run # shellcheck disable=SC2156 2025-09-07T14:48:35.1742832Z # shellcheck disable=SC2156 2025-09-07T14:48:35.1743271Z find . -iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-09-07T14:48:35.1758038Z shell: /usr/bin/bash -e {0} 2025-09-07T14:48:35.1758252Z env: 2025-09-07T14:48:35.1758416Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:35.1758690Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:35.1759053Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:35.1759468Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:35.1759821Z DEVICE_NAME: 2025-09-07T14:48:35.1759988Z DEVICE_TYPE: 2025-09-07T14:48:35.1760149Z ##[endgroup] 2025-09-07T14:48:35.8743528Z Prepare all required actions 2025-09-07T14:48:35.8743865Z Getting action download info 2025-09-07T14:48:35.9950010Z ##[group]Run ./.github/actions/upload-utilization-stats 2025-09-07T14:48:35.9950281Z with: 2025-09-07T14:48:35.9950442Z job_id: 49775781836 2025-09-07T14:48:35.9950736Z job_name: test-weekly / test 
(inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T14:48:35.9951092Z workflow_name: inductor-perf-nightly-h100 2025-09-07T14:48:35.9951355Z workflow_run_id: 17525296438 2025-09-07T14:48:35.9951562Z workflow_attempt: 1 2025-09-07T14:48:35.9951736Z env: 2025-09-07T14:48:35.9951898Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:35.9952154Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:35.9952482Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:35.9952906Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:35.9953263Z DEVICE_NAME: 2025-09-07T14:48:35.9953459Z DEVICE_TYPE: 2025-09-07T14:48:35.9953628Z ##[endgroup] 2025-09-07T14:48:36.0914227Z ##[group]Run echo "workflow_id: 17525296438" 2025-09-07T14:48:36.0914544Z echo "workflow_id: 17525296438" 2025-09-07T14:48:36.0914811Z echo "workflow_attempt: 1" 2025-09-07T14:48:36.0915310Z echo "workflow_Name: inductor-perf-nightly-h100" 2025-09-07T14:48:36.0915631Z echo "job_id: 49775781836" 2025-09-07T14:48:36.0916026Z echo "job_name: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100)" 2025-09-07T14:48:36.0916450Z echo "artifact_prefix: " 2025-09-07T14:48:36.0916964Z python3 --version 2025-09-07T14:48:36.0931493Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:36.0931788Z env: 2025-09-07T14:48:36.0931953Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:36.0932207Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:36.0932557Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:36.0932968Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:36.0933319Z DEVICE_NAME: 2025-09-07T14:48:36.0933490Z DEVICE_TYPE: 2025-09-07T14:48:36.0933651Z ##[endgroup] 2025-09-07T14:48:36.1405293Z workflow_id: 17525296438 2025-09-07T14:48:36.1405589Z workflow_attempt: 1 
2025-09-07T14:48:36.1405865Z workflow_Name: inductor-perf-nightly-h100 2025-09-07T14:48:36.1406172Z job_id: 49775781836 2025-09-07T14:48:36.1406539Z job_name: test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100) 2025-09-07T14:48:36.1406995Z artifact_prefix: 2025-09-07T14:48:36.1423559Z Python 3.10.12 2025-09-07T14:48:36.2332550Z ##[group]Run nick-fields/retry@v3.0.0 2025-09-07T14:48:36.2332856Z with: 2025-09-07T14:48:36.2333048Z shell: bash 2025-09-07T14:48:36.2333286Z timeout_minutes: 5 2025-09-07T14:48:36.2333530Z max_attempts: 5 2025-09-07T14:48:36.2333770Z retry_wait_seconds: 30 2025-09-07T14:48:36.2345331Z command: set -eu python3 -m pip install python-dateutil==2.8.2 boto3==1.35.42 pandas==2.1.3 dataclasses_json==0.6.7 2025-09-07T14:48:36.2345860Z polling_interval_seconds: 1 2025-09-07T14:48:36.2346085Z warning_on_retry: true 2025-09-07T14:48:36.2346290Z continue_on_error: false 2025-09-07T14:48:36.2346479Z env: 2025-09-07T14:48:36.2346632Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:36.2346894Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:36.2347232Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:36.2347693Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:36.2348105Z DEVICE_NAME: 2025-09-07T14:48:36.2348299Z DEVICE_TYPE: 2025-09-07T14:48:36.2348481Z ##[endgroup] 2025-09-07T14:48:36.5820880Z Defaulting to user installation because normal site-packages is not writeable 2025-09-07T14:48:37.1616161Z Collecting python-dateutil==2.8.2 2025-09-07T14:48:37.2183425Z Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) 2025-09-07T14:48:37.3818496Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 KB 1.5 MB/s eta 0:00:00 2025-09-07T14:48:38.0387089Z Collecting boto3==1.35.42 2025-09-07T14:48:38.0502948Z Downloading boto3-1.35.42-py3-none-any.whl (139 kB) 2025-09-07T14:48:38.4159372Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 
139.2/139.2 KB 354.8 kB/s eta 0:00:00 2025-09-07T14:48:38.6880303Z Collecting pandas==2.1.3 2025-09-07T14:48:38.7023843Z Downloading pandas-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB) 2025-09-07T14:48:39.7274196Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.3/12.3 MB 11.5 MB/s eta 0:00:00 2025-09-07T14:48:39.7678947Z Requirement already satisfied: dataclasses_json==0.6.7 in /home/david/.local/lib/python3.10/site-packages (0.6.7) 2025-09-07T14:48:39.7694331Z Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil==2.8.2) (1.16.0) 2025-09-07T14:48:39.7739866Z Requirement already satisfied: s3transfer<0.11.0,>=0.10.0 in /home/david/.local/lib/python3.10/site-packages (from boto3==1.35.42) (0.10.4) 2025-09-07T14:48:39.7744611Z Requirement already satisfied: botocore<1.36.0,>=1.35.42 in /home/david/.local/lib/python3.10/site-packages (from boto3==1.35.42) (1.35.99) 2025-09-07T14:48:39.7749383Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /home/david/.local/lib/python3.10/site-packages (from boto3==1.35.42) (1.0.1) 2025-09-07T14:48:40.2883786Z Collecting pytz>=2020.1 2025-09-07T14:48:40.2988157Z Downloading pytz-2025.2-py2.py3-none-any.whl (509 kB) 2025-09-07T14:48:40.6938908Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 509.2/509.2 KB 1.3 MB/s eta 0:00:00 2025-09-07T14:48:41.0930152Z Collecting tzdata>=2022.1 2025-09-07T14:48:41.1035238Z Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB) 2025-09-07T14:48:41.4068010Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 347.8/347.8 KB 1.1 MB/s eta 0:00:00 2025-09-07T14:48:42.1059452Z Collecting numpy<2,>=1.22.4 2025-09-07T14:48:42.1168935Z Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB) 2025-09-07T14:48:42.9018868Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 14.5 MB/s eta 0:00:00 2025-09-07T14:48:42.9298785Z Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in 
/home/david/.local/lib/python3.10/site-packages (from dataclasses_json==0.6.7) (3.26.1) 2025-09-07T14:48:42.9305958Z Requirement already satisfied: typing-inspect<1,>=0.4.0 in /home/david/.local/lib/python3.10/site-packages (from dataclasses_json==0.6.7) (0.9.0) 2025-09-07T14:48:42.9379142Z Requirement already satisfied: urllib3!=2.2.0,<3,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore<1.36.0,>=1.35.42->boto3==1.35.42) (1.26.5) 2025-09-07T14:48:42.9468356Z Requirement already satisfied: packaging>=17.0 in /home/david/.local/lib/python3.10/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses_json==0.6.7) (25.0) 2025-09-07T14:48:42.9570880Z Requirement already satisfied: mypy-extensions>=0.3.0 in /home/david/.local/lib/python3.10/site-packages (from typing-inspect<1,>=0.4.0->dataclasses_json==0.6.7) (1.1.0) 2025-09-07T14:48:42.9575814Z Requirement already satisfied: typing-extensions>=3.7.4 in /home/david/.local/lib/python3.10/site-packages (from typing-inspect<1,>=0.4.0->dataclasses_json==0.6.7) (4.15.0) 2025-09-07T14:48:43.2491591Z Installing collected packages: pytz, tzdata, python-dateutil, numpy, pandas, boto3 2025-09-07T14:48:46.9062175Z WARNING: The script f2py is installed in '/home/david/.local/bin' which is not on PATH. 2025-09-07T14:48:46.9062936Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-09-07T14:48:50.9000561Z Attempting uninstall: boto3 2025-09-07T14:48:50.9007323Z Found existing installation: boto3 1.35.33 2025-09-07T14:48:50.9229656Z Uninstalling boto3-1.35.33: 2025-09-07T14:48:50.9249622Z Successfully uninstalled boto3-1.35.33 2025-09-07T14:48:51.5561726Z Successfully installed boto3-1.35.42 numpy-1.26.4 pandas-2.1.3 python-dateutil-2.8.2 pytz-2025.2 tzdata-2025.2 2025-09-07T14:48:52.3190554Z Command completed after 1 attempt(s). 
2025-09-07T14:48:52.3261808Z ##[group]Run python3 -m tools.stats.upload_utilization_stats.upload_utilization_stats \ 2025-09-07T14:48:52.3264058Z python3 -m tools.stats.upload_utilization_stats.upload_utilization_stats \ 2025-09-07T14:48:52.3264454Z  --workflow-run-id "17525296438" \ 2025-09-07T14:48:52.3264756Z  --workflow-name "inductor-perf-nightly-h100" \ 2025-09-07T14:48:52.3265238Z  --workflow-run-attempt "1" \ 2025-09-07T14:48:52.3265669Z  --job-id "49775781836" \ 2025-09-07T14:48:52.3266044Z  --job-name "test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100)" \ 2025-09-07T14:48:52.3266414Z  --local-path "" \ 2025-09-07T14:48:52.3266623Z  --artifact-prefix "" 2025-09-07T14:48:52.3281020Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-09-07T14:48:52.3281315Z env: 2025-09-07T14:48:52.3281487Z GIT_DEFAULT_BRANCH: main 2025-09-07T14:48:52.3281742Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-09-07T14:48:52.3282095Z SCCACHE_SERVER_PORT_DOCKER_FLAG: -e SCCACHE_SERVER_PORT=5230 2025-09-07T14:48:52.3282514Z DOCKER_CONTAINER_ID: 146d7de0d6332085825694a17496a2071735879dc5f5418a0692a48b9009ad7f 2025-09-07T14:48:52.3282871Z DEVICE_NAME: 2025-09-07T14:48:52.3283057Z DEVICE_TYPE: 2025-09-07T14:48:52.3283215Z ##[endgroup] 2025-09-07T14:48:55.8349687Z repo: pytorch/pytorch 2025-09-07T14:48:55.8350275Z Search for test log in s3 bucket: ossci-utilization 2025-09-07T14:48:55.8351558Z Downloading logs-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip 2025-09-07T14:48:55.8352632Z extracting usage_log.txt from zip file logs-test-inductor_timm_perf_cuda_h100-7-7-linux.aws.h100_49775781836.zip 2025-09-07T14:48:55.8353622Z Converted Log Model: UtilizationMetadata: 2025-09-07T14:48:55.8356067Z UtilizationMetadata(level='metadata', workflow_id='17525296438', job_id='49775781836', workflow_name='inductor-perf-nightly-h100', job_name='test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100)', 
usage_collect_interval=4.0, data_model_version=1.5, start_at=1757232544, gpu_count=1, cpu_count=192, gpu_type='pynvml', error=None) 2025-09-07T14:48:55.8357309Z [Db Segments] detected pytest cmd: 3, generated segments: 3 2025-09-07T14:48:55.8357634Z [db model] Peek db timeseries 2025-09-07T14:48:55.8357848Z :{ 2025-09-07T14:48:55.8358030Z "created_at": 1757256535, 2025-09-07T14:48:55.8358254Z "type": "utilization", 2025-09-07T14:48:55.8358472Z "tags": [ 2025-09-07T14:48:55.8358649Z "record" 2025-09-07T14:48:55.8358819Z ], 2025-09-07T14:48:55.8358992Z "time_stamp": 1757232544, 2025-09-07T14:48:55.8359218Z "repo": "pytorch/pytorch", 2025-09-07T14:48:55.8359446Z "workflow_id": 17525296438, 2025-09-07T14:48:55.8359661Z "run_attempt": 1, 2025-09-07T14:48:55.8359865Z "job_id": 49775781836, 2025-09-07T14:48:55.8360130Z "workflow_name": "inductor-perf-nightly-h100", 2025-09-07T14:48:55.8360559Z "job_name": "test-weekly / test (inductor_timm_perf_cuda_h100, 7, 7, linux.aws.h100)", 2025-09-07T14:48:55.8360933Z "json_data": "{}" 2025-09-07T14:48:55.8361122Z } 2025-09-07T14:48:55.8361530Z Writing 1 documents to S3 ossci-utilization/util_metadata/v_1.5/pytorch/pytorch/17525296438/1/49775781836/metadata 2025-09-07T14:48:55.8362290Z Done! Finish writing document to S3 ossci-utilization/util_metadata/v_1.5/pytorch/pytorch/17525296438/1/49775781836/metadata 2025-09-07T14:48:55.8363072Z Writing 1590 documents to S3 ossci-utilization/util_timeseries/v_1.5/pytorch/pytorch/17525296438/1/49775781836/time_series 2025-09-07T14:48:55.8363838Z Done! Finish writing document to S3 ossci-utilization/util_timeseries/v_1.5/pytorch/pytorch/17525296438/1/49775781836/time_series 2025-09-07T14:48:55.9289708Z Post job cleanup. 2025-09-07T14:48:55.9323262Z Post job cleanup. 
2025-09-07T14:48:56.0246635Z [command]/usr/bin/git version 2025-09-07T14:48:56.0284436Z git version 2.50.1 2025-09-07T14:48:56.0323654Z Temporarily overriding HOME='/home/david/_work/_temp/270c63d0-4b5b-40d8-9a9c-706c62c9f236' before making global git config changes 2025-09-07T14:48:56.0324344Z Adding repository directory to the temporary git global config as a safe directory 2025-09-07T14:48:56.0328489Z [command]/usr/bin/git config --global --add safe.directory /home/david/_work/pytorch/pytorch 2025-09-07T14:48:56.0364236Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-09-07T14:48:56.0407983Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-09-07T14:48:56.0690869Z Entering 'android/libs/fbjni' 2025-09-07T14:48:56.0742154Z Entering 'third_party/FP16' 2025-09-07T14:48:56.0791470Z Entering 'third_party/FXdiv' 2025-09-07T14:48:56.0840929Z Entering 'third_party/NNPACK' 2025-09-07T14:48:56.0891023Z Entering 'third_party/NVTX' 2025-09-07T14:48:56.0940895Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T14:48:56.0990543Z Entering 'third_party/XNNPACK' 2025-09-07T14:48:56.1054442Z Entering 'third_party/aiter' 2025-09-07T14:48:56.1106278Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T14:48:56.1166957Z Entering 'third_party/benchmark' 2025-09-07T14:48:56.1217684Z Entering 'third_party/composable_kernel' 2025-09-07T14:48:56.1276753Z Entering 'third_party/cpp-httplib' 2025-09-07T14:48:56.1327733Z Entering 'third_party/cpuinfo' 2025-09-07T14:48:56.1379264Z Entering 'third_party/cudnn_frontend' 2025-09-07T14:48:56.1430687Z Entering 'third_party/cutlass' 2025-09-07T14:48:56.1489770Z Entering 'third_party/fbgemm' 2025-09-07T14:48:56.1542623Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T14:48:56.1593348Z Entering 'third_party/fbgemm/external/composable_kernel' 
2025-09-07T14:48:56.1650544Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T14:48:56.1700397Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T14:48:56.1758054Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T14:48:56.1807054Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T14:48:56.1854607Z Entering 'third_party/fbgemm/external/json' 2025-09-07T14:48:56.1907296Z Entering 'third_party/flash-attention' 2025-09-07T14:48:56.1957793Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T14:48:56.2015981Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T14:48:56.2072851Z Entering 'third_party/flatbuffers' 2025-09-07T14:48:56.2125404Z Entering 'third_party/fmt' 2025-09-07T14:48:56.2178256Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T14:48:56.2228816Z Entering 'third_party/gloo' 2025-09-07T14:48:56.2279594Z Entering 'third_party/googletest' 2025-09-07T14:48:56.2330168Z Entering 'third_party/ideep' 2025-09-07T14:48:56.2377969Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T14:48:56.2434799Z Entering 'third_party/ittapi' 2025-09-07T14:48:56.2484167Z Entering 'third_party/kineto' 2025-09-07T14:48:56.2536452Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T14:48:56.2584827Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T14:48:56.2635512Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T14:48:56.2683272Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T14:48:56.2731047Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T14:48:56.2777854Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T14:48:56.2830491Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T14:48:56.2879698Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T14:48:56.2928965Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T14:48:56.2978968Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T14:48:56.3030828Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T14:48:56.3079516Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T14:48:56.3131579Z Entering 'third_party/kleidiai' 2025-09-07T14:48:56.3185106Z Entering 'third_party/mimalloc' 2025-09-07T14:48:56.3234417Z Entering 'third_party/nlohmann' 2025-09-07T14:48:56.3284627Z Entering 'third_party/onnx' 2025-09-07T14:48:56.3349444Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T14:48:56.3403790Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T14:48:56.3455273Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T14:48:56.3503327Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T14:48:56.3554776Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T14:48:56.3602615Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T14:48:56.3652589Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T14:48:56.3701514Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T14:48:56.3749649Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T14:48:56.3796061Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T14:48:56.3846351Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T14:48:56.3898389Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T14:48:56.3966146Z Entering 'third_party/pocketfft' 2025-09-07T14:48:56.4015758Z Entering 'third_party/protobuf' 2025-09-07T14:48:56.4067079Z Entering 
'third_party/protobuf/third_party/benchmark' 2025-09-07T14:48:56.4117131Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T14:48:56.4168554Z Entering 'third_party/psimd' 2025-09-07T14:48:56.4219144Z Entering 'third_party/pthreadpool' 2025-09-07T14:48:56.4268899Z Entering 'third_party/pybind11' 2025-09-07T14:48:56.4318631Z Entering 'third_party/python-peachpy' 2025-09-07T14:48:56.4368003Z Entering 'third_party/sleef' 2025-09-07T14:48:56.4418057Z Entering 'third_party/tensorpipe' 2025-09-07T14:48:56.4466711Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T14:48:56.4514231Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T14:48:56.4562028Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T14:48:56.4609372Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T14:48:56.4655401Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T14:48:56.4729615Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-09-07T14:48:56.4755875Z http.https://github.com/.extraheader 2025-09-07T14:48:56.4765665Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-09-07T14:48:56.4799433Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-09-07T14:48:56.5066911Z Entering 'android/libs/fbjni' 2025-09-07T14:48:56.5096599Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5132192Z Entering 'third_party/FP16' 2025-09-07T14:48:56.5159939Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5194775Z Entering 'third_party/FXdiv' 2025-09-07T14:48:56.5223428Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5258405Z Entering 'third_party/NNPACK' 2025-09-07T14:48:56.5286595Z http.https://github.com/.extraheader 
2025-09-07T14:48:56.5326094Z Entering 'third_party/NVTX' 2025-09-07T14:48:56.5354133Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5392042Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T14:48:56.5419460Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5456942Z Entering 'third_party/XNNPACK' 2025-09-07T14:48:56.5484627Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5536878Z Entering 'third_party/aiter' 2025-09-07T14:48:56.5566214Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5604388Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T14:48:56.5631764Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5676596Z Entering 'third_party/benchmark' 2025-09-07T14:48:56.5705188Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5741755Z Entering 'third_party/composable_kernel' 2025-09-07T14:48:56.5770142Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5813469Z Entering 'third_party/cpp-httplib' 2025-09-07T14:48:56.5842623Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5878121Z Entering 'third_party/cpuinfo' 2025-09-07T14:48:56.5908036Z http.https://github.com/.extraheader 2025-09-07T14:48:56.5947340Z Entering 'third_party/cudnn_frontend' 2025-09-07T14:48:56.5974822Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6012591Z Entering 'third_party/cutlass' 2025-09-07T14:48:56.6040885Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6086362Z Entering 'third_party/fbgemm' 2025-09-07T14:48:56.6114510Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6152247Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T14:48:56.6179291Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6215950Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T14:48:56.6241971Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6286076Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T14:48:56.6312525Z 
http.https://github.com/.extraheader 2025-09-07T14:48:56.6348089Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T14:48:56.6374501Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6419509Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T14:48:56.6446561Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6483100Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T14:48:56.6509758Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6545090Z Entering 'third_party/fbgemm/external/json' 2025-09-07T14:48:56.6571960Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6612559Z Entering 'third_party/flash-attention' 2025-09-07T14:48:56.6641375Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6678397Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T14:48:56.6704777Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6746234Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T14:48:56.6772428Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6819242Z Entering 'third_party/flatbuffers' 2025-09-07T14:48:56.6847697Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6887531Z Entering 'third_party/fmt' 2025-09-07T14:48:56.6917299Z http.https://github.com/.extraheader 2025-09-07T14:48:56.6954375Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T14:48:56.6983113Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7020938Z Entering 'third_party/gloo' 2025-09-07T14:48:56.7048459Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7086152Z Entering 'third_party/googletest' 2025-09-07T14:48:56.7114214Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7152011Z Entering 'third_party/ideep' 2025-09-07T14:48:56.7179538Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7215841Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T14:48:56.7241888Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7286676Z Entering 
'third_party/ittapi' 2025-09-07T14:48:56.7314591Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7350208Z Entering 'third_party/kineto' 2025-09-07T14:48:56.7378969Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7414356Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T14:48:56.7442811Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7478631Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T14:48:56.7505461Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7544708Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T14:48:56.7572039Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7610061Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T14:48:56.7637120Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7674046Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T14:48:56.7700736Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7735985Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T14:48:56.7763732Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7802624Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T14:48:56.7829358Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7865779Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T14:48:56.7893156Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7930267Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T14:48:56.7957797Z http.https://github.com/.extraheader 2025-09-07T14:48:56.7995816Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T14:48:56.8023074Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8063913Z Entering 
'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T14:48:56.8090482Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8126919Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T14:48:56.8152304Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8191741Z Entering 'third_party/kleidiai' 2025-09-07T14:48:56.8220072Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8257708Z Entering 'third_party/mimalloc' 2025-09-07T14:48:56.8286368Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8323866Z Entering 'third_party/nlohmann' 2025-09-07T14:48:56.8352019Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8390448Z Entering 'third_party/onnx' 2025-09-07T14:48:56.8419224Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8470843Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T14:48:56.8498918Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8540504Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T14:48:56.8568991Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8606415Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T14:48:56.8632597Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8668769Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T14:48:56.8695306Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8731384Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T14:48:56.8757380Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8792289Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T14:48:56.8818916Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8855489Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T14:48:56.8881891Z http.https://github.com/.extraheader 2025-09-07T14:48:56.8917334Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T14:48:56.8943521Z 
http.https://github.com/.extraheader 2025-09-07T14:48:56.8979743Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T14:48:56.9006749Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9040982Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T14:48:56.9067740Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9105569Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T14:48:56.9131928Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9170608Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T14:48:56.9197213Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9254158Z Entering 'third_party/pocketfft' 2025-09-07T14:48:56.9282639Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9319942Z Entering 'third_party/protobuf' 2025-09-07T14:48:56.9348735Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9386900Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T14:48:56.9413877Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9450402Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T14:48:56.9479571Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9519359Z Entering 'third_party/psimd' 2025-09-07T14:48:56.9548410Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9585992Z Entering 'third_party/pthreadpool' 2025-09-07T14:48:56.9614517Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9650759Z Entering 'third_party/pybind11' 2025-09-07T14:48:56.9679601Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9717327Z Entering 'third_party/python-peachpy' 2025-09-07T14:48:56.9744811Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9781668Z Entering 'third_party/sleef' 2025-09-07T14:48:56.9810062Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9847090Z Entering 'third_party/tensorpipe' 2025-09-07T14:48:56.9876019Z 
http.https://github.com/.extraheader 2025-09-07T14:48:56.9912275Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T14:48:56.9939737Z http.https://github.com/.extraheader 2025-09-07T14:48:56.9976386Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T14:48:57.0001819Z http.https://github.com/.extraheader 2025-09-07T14:48:57.0037709Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T14:48:57.0063427Z http.https://github.com/.extraheader 2025-09-07T14:48:57.0099995Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T14:48:57.0126336Z http.https://github.com/.extraheader 2025-09-07T14:48:57.0160676Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T14:48:57.0187694Z http.https://github.com/.extraheader 2025-09-07T14:48:57.0374646Z Post job cleanup. 2025-09-07T14:48:57.1300837Z [command]/usr/bin/git version 2025-09-07T14:48:57.1338599Z git version 2.50.1 2025-09-07T14:48:57.1378710Z Temporarily overriding HOME='/home/david/_work/_temp/826dbc8a-fe74-483f-a488-8aefc0331a11' before making global git config changes 2025-09-07T14:48:57.1379415Z Adding repository directory to the temporary git global config as a safe directory 2025-09-07T14:48:57.1383439Z [command]/usr/bin/git config --global --add safe.directory /home/david/_work/pytorch/pytorch 2025-09-07T14:48:57.1418552Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-09-07T14:48:57.1461601Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-09-07T14:48:57.1735524Z Entering 'android/libs/fbjni' 2025-09-07T14:48:57.1786100Z Entering 'third_party/FP16' 2025-09-07T14:48:57.1835752Z Entering 'third_party/FXdiv' 2025-09-07T14:48:57.1885709Z Entering 'third_party/NNPACK' 2025-09-07T14:48:57.1937366Z Entering 'third_party/NVTX' 2025-09-07T14:48:57.1988596Z Entering 
'third_party/VulkanMemoryAllocator' 2025-09-07T14:48:57.2037873Z Entering 'third_party/XNNPACK' 2025-09-07T14:48:57.2104854Z Entering 'third_party/aiter' 2025-09-07T14:48:57.2156277Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T14:48:57.2213957Z Entering 'third_party/benchmark' 2025-09-07T14:48:57.2266940Z Entering 'third_party/composable_kernel' 2025-09-07T14:48:57.2328540Z Entering 'third_party/cpp-httplib' 2025-09-07T14:48:57.2378961Z Entering 'third_party/cpuinfo' 2025-09-07T14:48:57.2429063Z Entering 'third_party/cudnn_frontend' 2025-09-07T14:48:57.2479072Z Entering 'third_party/cutlass' 2025-09-07T14:48:57.2536828Z Entering 'third_party/fbgemm' 2025-09-07T14:48:57.2588407Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T14:48:57.2636982Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T14:48:57.2691238Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T14:48:57.2738728Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T14:48:57.2794901Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T14:48:57.2841819Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T14:48:57.2888850Z Entering 'third_party/fbgemm/external/json' 2025-09-07T14:48:57.2939591Z Entering 'third_party/flash-attention' 2025-09-07T14:48:57.2990213Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T14:48:57.3043729Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T14:48:57.3100579Z Entering 'third_party/flatbuffers' 2025-09-07T14:48:57.3152682Z Entering 'third_party/fmt' 2025-09-07T14:48:57.3202305Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T14:48:57.3252065Z Entering 'third_party/gloo' 2025-09-07T14:48:57.3302446Z Entering 'third_party/googletest' 2025-09-07T14:48:57.3352852Z Entering 'third_party/ideep' 2025-09-07T14:48:57.3400911Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T14:48:57.3458466Z Entering 'third_party/ittapi' 2025-09-07T14:48:57.3508569Z Entering 
'third_party/kineto' 2025-09-07T14:48:57.3557844Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T14:48:57.3605321Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T14:48:57.3655599Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T14:48:57.3703538Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T14:48:57.3750972Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T14:48:57.3797612Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T14:48:57.3848699Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T14:48:57.3896730Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T14:48:57.3944402Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T14:48:57.3993611Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T14:48:57.4043966Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T14:48:57.4091682Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T14:48:57.4141640Z Entering 'third_party/kleidiai' 2025-09-07T14:48:57.4192554Z Entering 'third_party/mimalloc' 2025-09-07T14:48:57.4242582Z Entering 'third_party/nlohmann' 2025-09-07T14:48:57.4293803Z Entering 'third_party/onnx' 2025-09-07T14:48:57.4357777Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T14:48:57.4411346Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T14:48:57.4464402Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T14:48:57.4512408Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T14:48:57.4559468Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T14:48:57.4606166Z Entering 
'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T14:48:57.4653755Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T14:48:57.4700266Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T14:48:57.4747290Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T14:48:57.4793196Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T14:48:57.4842317Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T14:48:57.4893461Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T14:48:57.4962795Z Entering 'third_party/pocketfft' 2025-09-07T14:48:57.5012485Z Entering 'third_party/protobuf' 2025-09-07T14:48:57.5063959Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T14:48:57.5111061Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T14:48:57.5161178Z Entering 'third_party/psimd' 2025-09-07T14:48:57.5210603Z Entering 'third_party/pthreadpool' 2025-09-07T14:48:57.5260424Z Entering 'third_party/pybind11' 2025-09-07T14:48:57.5311202Z Entering 'third_party/python-peachpy' 2025-09-07T14:48:57.5361875Z Entering 'third_party/sleef' 2025-09-07T14:48:57.5412158Z Entering 'third_party/tensorpipe' 2025-09-07T14:48:57.5462276Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T14:48:57.5509865Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T14:48:57.5556933Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-09-07T14:48:57.5604591Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T14:48:57.5649931Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T14:48:57.5723630Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-09-07T14:48:57.5756384Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config 
--local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-09-07T14:48:57.6019152Z Entering 'android/libs/fbjni' 2025-09-07T14:48:57.6069705Z Entering 'third_party/FP16' 2025-09-07T14:48:57.6119987Z Entering 'third_party/FXdiv' 2025-09-07T14:48:57.6171038Z Entering 'third_party/NNPACK' 2025-09-07T14:48:57.6221742Z Entering 'third_party/NVTX' 2025-09-07T14:48:57.6272933Z Entering 'third_party/VulkanMemoryAllocator' 2025-09-07T14:48:57.6323453Z Entering 'third_party/XNNPACK' 2025-09-07T14:48:57.6388811Z Entering 'third_party/aiter' 2025-09-07T14:48:57.6439797Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-09-07T14:48:57.6498450Z Entering 'third_party/benchmark' 2025-09-07T14:48:57.6548426Z Entering 'third_party/composable_kernel' 2025-09-07T14:48:57.6607004Z Entering 'third_party/cpp-httplib' 2025-09-07T14:48:57.6657615Z Entering 'third_party/cpuinfo' 2025-09-07T14:48:57.6708861Z Entering 'third_party/cudnn_frontend' 2025-09-07T14:48:57.6759087Z Entering 'third_party/cutlass' 2025-09-07T14:48:57.6818374Z Entering 'third_party/fbgemm' 2025-09-07T14:48:57.6870616Z Entering 'third_party/fbgemm/external/asmjit' 2025-09-07T14:48:57.6918043Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-09-07T14:48:57.6972036Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-09-07T14:48:57.7019696Z Entering 'third_party/fbgemm/external/cutlass' 2025-09-07T14:48:57.7076080Z Entering 'third_party/fbgemm/external/googletest' 2025-09-07T14:48:57.7122896Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-09-07T14:48:57.7169982Z Entering 'third_party/fbgemm/external/json' 2025-09-07T14:48:57.7221392Z Entering 'third_party/flash-attention' 2025-09-07T14:48:57.7271482Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-09-07T14:48:57.7324268Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-09-07T14:48:57.7381337Z Entering 
'third_party/flatbuffers' 2025-09-07T14:48:57.7433915Z Entering 'third_party/fmt' 2025-09-07T14:48:57.7484067Z Entering 'third_party/gemmlowp/gemmlowp' 2025-09-07T14:48:57.7534261Z Entering 'third_party/gloo' 2025-09-07T14:48:57.7583722Z Entering 'third_party/googletest' 2025-09-07T14:48:57.7633624Z Entering 'third_party/ideep' 2025-09-07T14:48:57.7682209Z Entering 'third_party/ideep/mkl-dnn' 2025-09-07T14:48:57.7737737Z Entering 'third_party/ittapi' 2025-09-07T14:48:57.7788498Z Entering 'third_party/kineto' 2025-09-07T14:48:57.7838843Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-09-07T14:48:57.7885903Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-09-07T14:48:57.7934566Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-09-07T14:48:57.7982820Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-09-07T14:48:57.8030421Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-09-07T14:48:57.8077285Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-09-07T14:48:57.8128711Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-09-07T14:48:57.8176102Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-09-07T14:48:57.8223781Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-09-07T14:48:57.8272766Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-09-07T14:48:57.8323037Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-09-07T14:48:57.8371163Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-09-07T14:48:57.8421518Z Entering 'third_party/kleidiai' 2025-09-07T14:48:57.8472915Z Entering 'third_party/mimalloc' 2025-09-07T14:48:57.8522225Z Entering 'third_party/nlohmann' 2025-09-07T14:48:57.8575226Z Entering 'third_party/onnx' 
2025-09-07T14:48:57.8638911Z Entering 'third_party/onnx/third_party/pybind11' 2025-09-07T14:48:57.8692739Z Entering 'third_party/opentelemetry-cpp' 2025-09-07T14:48:57.8743163Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-09-07T14:48:57.8791311Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-09-07T14:48:57.8838845Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-09-07T14:48:57.8886088Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-09-07T14:48:57.8934002Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-09-07T14:48:57.8980938Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-09-07T14:48:57.9028652Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-09-07T14:48:57.9075418Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-09-07T14:48:57.9125501Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-09-07T14:48:57.9175614Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-09-07T14:48:57.9243169Z Entering 'third_party/pocketfft' 2025-09-07T14:48:57.9292933Z Entering 'third_party/protobuf' 2025-09-07T14:48:57.9343601Z Entering 'third_party/protobuf/third_party/benchmark' 2025-09-07T14:48:57.9391253Z Entering 'third_party/protobuf/third_party/googletest' 2025-09-07T14:48:57.9443126Z Entering 'third_party/psimd' 2025-09-07T14:48:57.9492818Z Entering 'third_party/pthreadpool' 2025-09-07T14:48:57.9541331Z Entering 'third_party/pybind11' 2025-09-07T14:48:57.9590899Z Entering 'third_party/python-peachpy' 2025-09-07T14:48:57.9639502Z Entering 'third_party/sleef' 2025-09-07T14:48:57.9688668Z Entering 'third_party/tensorpipe' 2025-09-07T14:48:57.9739286Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-09-07T14:48:57.9786838Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-09-07T14:48:57.9832870Z Entering 
'third_party/tensorpipe/third_party/libuv' 2025-09-07T14:48:57.9880262Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-09-07T14:48:57.9925511Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-09-07T14:48:58.0117420Z Cleaning up orphan processes